1 Introduction

Let $\mathcal{M}$ be a compact metric space, equipped with a Borel measure $\mu$ and the corresponding Borel sigma-field. Let $L^p := L^p(\mathcal{M},\mu)$, $p \ge 1$, denote the space of $p$-integrable real functions defined on $\mathcal{M}$ with respect to $\mu$.

In this paper we investigate rates of contraction of posterior distributions for nonparametric models 
on geometrical structures such as 

1. Gaussian white noise on a compact metric space $\mathcal{M}$, where, for $n \ge 1$, one observes

$$dX^{(n)}(x) = f(x)\,dx + \frac{1}{\sqrt{n}}\,dZ(x), \qquad x \in \mathcal{M},$$

where $f$ is in $L^2$ and $Z$ is a white noise on $\mathcal{M}$.

2. Fixed design regression where one observes, for $n \ge 1$,

$$Y_i = f(x_i) + \varepsilon_i, \qquad 1 \le i \le n.$$

The design points $\{x_i\}$ are fixed on $\mathcal{M}$ and the variables $\{\varepsilon_i\}$ are assumed to be independent standard normal.

I. Castillo's work is partly supported by ANR Grant 'Banhdits' ANR-2010-BLAN-0113-03.
Ismael Castillo

CNRS-LPMA Universités Paris 6 and 7, E-mail: ismael.castillo@upmc.fr
Gérard Kerkyacharian and Dominique Picard

Université Paris Diderot - Paris 7, LPMA, E-mail: kerk@math.jussieu.fr, picard@math.jussieu.fr



3. Density estimation on a manifold where the observations are a sample

$$(X_i)_{1 \le i \le n} \sim f,$$

$X_1, \ldots, X_n$ are independent identically distributed $\mathcal{M}$-valued random variables with positive density function $f$ on $\mathcal{M}$.

Although an impressive amount of work has been done using frequentist approaches to estimation on manifolds, see [21] and the references therein, we focus in this paper on the Bayes posterior measure. The study of the behaviour of Bayesian nonparametric methods has recently experienced considerable development, in particular after the seminal works of A. W. van der Vaart, H. van Zanten, S. Ghosal and J. K. Ghosh [12], [27]. In particular, the class of Gaussian processes forms an important family of nonparametric prior distributions, for which precise rates have been obtained [29], see also [6] for lower bound counterparts. In [31], the authors obtained adaptive performance up to logarithmic terms by introducing a random rescaling of a very smooth Gaussian random field. These results have been obtained on $[0,1]^d$, $d \ge 1$. Our point in this paper is to develop a Bayesian procedure adapted to the geometrical structure of the data. Among the examples covered by our results, we can cite directional data corresponding to the spherical case and more generally the case of data supported by a compact manifold.

We follow the illuminating approach of [29] and [31] and use a fixed prior distribution, constructed by rescaling a smooth Gaussian random field. In our more general setting, we show how the rescaling is made possible by introducing a notion of time decoupled from the underlying space. Another important difference brought by the geometrical nature of the problem is the underlying Gaussian process, which now originates from a harmonic analysis of the data space $\mathcal{M}$, with the rescaling naturally acting on the frequency domain.

We suppose that $\mathcal{M}$ is equipped with a positive self-adjoint operator $L$ such that the associated semi-group $e^{-tL}$, $t > 0$, the heat kernel, allows a smooth functional calculus, which in turn allows the construction of the Gaussian random field. Our prior can then be interpreted as a randomly rescaled (random) solution of the heat equation.

We also took inspiration from earlier work by [1], where the authors consider a symmetry-adaptive Bayesian estimator in a regression framework. Precise minimax rates in the $L^2$-norm over Sobolev spaces of functions on compact connected orientable manifolds without boundary are obtained in [11]. We also mention a recent development by [2], where Bayesian consistency properties are derived for priors based on mixtures of kernels over a compact manifold.

Here is an outline of the paper. We first detail in Section 2 the properties assumed on the structure $\mathcal{M}$ and the associated heat kernel allowing our construction and give examples. We then construct the associated Gaussian prior in Section 3 and prove some approximation and concentration properties typically needed to obtain rates of contraction of posterior distributions in Sections 4 and 5. We then prove in Section 6 upper-bound rates in the three statistical examples detailed above. Lower bounds are considered in Section 7. The Appendices in Sections 8, 9 and 10 contain respectively the definition of Besov spaces, the proofs of entropy results and a property of measure of balls on compact Riemannian manifolds.

The notation $\lesssim$ means less than or equal to up to some universal constant. For any sequences of reals $(a_n)_{n\ge 0}$ and $(b_n)_{n\ge 0}$, the notation $a_n \sim b_n$ means that the sequences verify $c \le \liminf_n (b_n/a_n) \le \limsup_n (b_n/a_n) \le d$ for some positive constants $c, d$, and $a_n \ll b_n$ stands for $\lim_n (a_n/b_n) = 0$. For any reals $a, b$, we denote $\min(a,b) = a \wedge b$ and $\max(a,b) = a \vee b$.



2 The geometrical framework 

The squared-exponential covariance kernel introduced in [31], which gives rise to a particular Reproducing 
Kernel Hilbert Space (RKHS) see [30] and Section 3 below, has in fact a natural extension to more general 
metric spaces. 

Suppose $L^2 = \bigoplus_{k \ge 0} \mathcal{H}_k$, where the $\mathcal{H}_k$ are supposed to be finite-dimensional subspaces of $L^2$ consisting of continuous functions on $\mathcal{M}$, and orthogonal in $L^2$. Then, the projector $P_k$ on $\mathcal{H}_k$ is actually a kernel operator $P_k(x,y) := \sum_{1 \le i \le \dim(\mathcal{H}_k)} e_k^i(x)\, e_k^i(y)$, where $\{e_k^i\}$ is any orthonormal basis of $\mathcal{H}_k$; so it is obviously a positive-definite kernel. Also, given $\varphi : \mathbb{N} \to (0,+\infty)$ and under a uniform convergence assumption, $K_\varphi(x,y) = \sum_{k \ge 0} \varphi(k) P_k(x,y)$ is a positive definite kernel which is the covariance kernel of a Gaussian process.

Here, we will focus on the case where the subspaces $\mathcal{H}_k =: \mathcal{H}_{\lambda_k}$ are the eigenspaces of a self-adjoint positive operator $L$ and $\varphi(k) = e^{-\lambda_k t}$, $t > 0$, so that $\sum_k e^{-\lambda_k t} P_k(x,y)$ is actually the kernel of the associated semi-group $e^{-tL}$.

This construction, under the following appropriate conditions, yields a natural generalisation of the squared-exponential covariance kernel on the real line.



2.1 Compact metric doubling space 

The open balls of radius $r$ centered in $x \in \mathcal{M}$ are denoted by $B(x,r)$ and to simplify the notation we put $\mu(B(x,r)) =: |B(x,r)|$. The metric is denoted by $\rho$. For simplicity, we can impose, in the abstract proofs, that $\mu(\mathcal{M}) = 1 = \mathrm{diam}(\mathcal{M})$. But of course, this is not the case in practical situations: the metric and the measure on a compact Riemannian manifold are not normalized (see Section 2.5 below). We assume that $\mathcal{M}$ has the so-called doubling property, i.e. there exists a constant $0 < D < \infty$ such that:

$$\text{for all } x \in \mathcal{M},\ r > 0, \qquad 0 < |B(x,2r)| \le 2^D |B(x,r)|. \tag{1}$$



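Remark 1 As a simple consequence of (1) we have:

$$\text{for all } x, y \in \mathcal{M},\ \text{for all } 0 < r \le R, \qquad |B(x,R)| \le \Big(\frac{2R}{r}\Big)^D \Big(1 + \frac{\rho(x,y)}{R}\Big)^D |B(y,r)|. \tag{2}$$

Moreover, as $\mathcal{M} = B(x,1)$, we have $1 = |B(x,1)| \le (2/\delta)^D |B(x,\delta)|$. Hence,

$$\text{for all } x \in \mathcal{M},\ \text{for all } 0 < \delta \le 1, \qquad \frac{1}{|B(x,\delta)|} \le \Big(\frac{2}{\delta}\Big)^D. \tag{3}$$

If $\mathcal{M}$ is connected one can prove additionally (see [9]) that there exist $c > 0$, $\beta > 0$, such that:

$$\text{for all } x \in \mathcal{M},\ \text{for all } 0 < \delta \le 1, \qquad c\,\Big(\frac{1}{\delta}\Big)^{\beta} \le \frac{1}{|B(x,\delta)|} \le \Big(\frac{2}{\delta}\Big)^D. \tag{4}$$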



2.2 Heat kernel 

For this section, we follow standard expositions of heat kernel theory; for more details see [22], [26], [15]. We suppose that there exists a self-adjoint positive operator $L$ defined on a domain $D \subset L^2$ dense in $L^2$. Then $-L$ is the infinitesimal generator of a self-adjoint positive semigroup $e^{-tL}$.

We suppose in addition that $e^{-tL}$ is a Markov kernel operator, i.e. there exists a nonnegative kernel $P_t(x,y)$ ('the heat kernel') such that:

$$e^{-tL} f(x) = \int_{\mathcal{M}} P_t(x,y) f(y)\, d\mu(y), \tag{5}$$

$$P_t(x,y) = P_t(y,x), \tag{6}$$

$$\int_{\mathcal{M}} P_t(x,y)\, d\mu(y) = 1, \tag{7}$$

$$P_{t+s}(x,y) = \int_{\mathcal{M}} P_t(x,u) P_s(u,y)\, d\mu(u). \tag{8}$$

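The following additional assumptions are central in our setting: there exist $C_1 > 0$, $c_1 > 0$, $\alpha > 0$, such that for all $t \in\, ]0,1[$,

$$0 \le P_t(x,y) \le \frac{C_1\, e^{-c_1 \rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}}, \tag{9}$$

$$|P_t(x,y) - P_t(x,y')| \le \Big[\frac{\rho(y,y')}{\sqrt{t}}\Big]^{\alpha} \frac{C_1\, e^{-c_1 \rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}}. \tag{10}$$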



One can easily prove (see [9]) that under the assumptions above, necessarily there exist positive constants $c_1'$, $C_1'$ such that

$$\frac{c_1'}{|B(x,\sqrt{t})|} \le P_t(x,x) \le \frac{C_1'}{|B(x,\sqrt{t})|}. \tag{11}$$

Remark 2 Under mild additional conditions on the space $\mathcal{M}$, see [22], Section 7.8, one can actually prove that there exist $C_1, C_2 > 0$, $c_1, c_2 > 0$, $\alpha > 0$, such that for all $t \in\, ]0,1[$,

$$\frac{C_2\, e^{-c_2 \rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}} \le P_t(x,y) \le \frac{C_1\, e^{-c_1 \rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}}.$$

In the last display, one can actually see that the heat kernel has a behaviour of squared-exponential type. Furthermore, it is a positive definite kernel, see below.
2.3 Spectral decomposition

The assumptions above have as a consequence ([9], Proposition 3.20) that the spectral decomposition of $L$ is discrete: there exists a sequence $0 = \lambda_0 < \lambda_1 < \lambda_2 < \ldots$ of eigenvalues of $L$ associated with finite-dimensional eigenspaces $\mathcal{H}_{\lambda_k}$ such that:

$$L^2 = \bigoplus_{k \ge 0} \mathcal{H}_{\lambda_k}.$$

Necessarily $\mathcal{H}_{\lambda_k}$ is a subset of $C(\mathcal{M})$, the space of continuous functions on $\mathcal{M}$. More precisely, the projectors $P_{\mathcal{H}_{\lambda_k}}$ are kernel operators $P_k(x,y)$ with the following description:

$$P_k(x,y) = \sum_{1 \le l \le \dim(\mathcal{H}_{\lambda_k})} e_k^l(x)\, e_k^l(y),$$

as soon as $\{e_k^l,\ 1 \le l \le \dim(\mathcal{H}_{\lambda_k})\}$ is an orthonormal basis of $\mathcal{H}_{\lambda_k}$. The Markov kernel $P_t$ writes:

$$P_t(x,y) = \sum_k e^{-t\lambda_k} P_k(x,y). \tag{12}$$

Moreover, $e^{-tL}$ is a trace class operator. Using Mercer's theorem, one can prove in addition that the convergence in the series is uniform.



2.4 Smooth functional calculus and 'sampling-father-wavelets' 

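More generally, for any (very regular) $\Phi \in \mathcal{D}(\mathbb{R})$ and $0 < \delta \le 1$, $\Phi(\delta\sqrt{L})$ is a kernel operator described using the spectral decomposition, via the following formula:

$$\Phi(\delta\sqrt{L})(x,y) = \sum_k \Phi(\delta\sqrt{\lambda_k})\, P_k(x,y). \tag{13}$$
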

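Our previous assumptions have the following consequences which will be important in the sequel, see [9]:

Localization: ([9], Section 3) There exists a constant $C(\Phi)$ such that

$$\text{for all } 0 < \delta \le 1,\ \forall x, y \in \mathcal{M}, \qquad |\Phi(\delta\sqrt{L})(x,y)| \le \frac{1}{|B(x,\delta)|}\, \frac{C(\Phi)}{\big(1 + \rho(x,y)/\delta\big)^{D+1}}. \tag{14}$$

From (14) and (2) one can easily deduce the symmetrical bound (with a possibly different constant $C(\Phi)$)

$$|\Phi(\delta\sqrt{L})(x,y)| \le \frac{C(\Phi)}{\sqrt{|B(x,\delta)|\,|B(y,\delta)|}\,\big(1 + \rho(x,y)/\delta\big)^{D/2+1}}.$$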

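Father wavelet: ([9], Lemmas 5.2 and 5.4) There exist structural constants $0 < C_0 < \infty$, $0 < \gamma$ such that for any $0 < \delta \le 1$, for any maximal $\gamma\delta$-net $\Lambda_{\gamma\delta}$, there exists a family of functions $(D_\xi^\delta)_{\xi \in \Lambda_{\gamma\delta}}$ such that

$$|D_\xi^\delta(x)| \le \frac{1}{|B(x,\delta)|}\, \frac{C_0}{\big(1 + \rho(x,\xi)/\delta\big)^{D+1}}, \qquad \forall x \in \mathcal{M}, \tag{15}$$

and if we define the 'low frequency' functions

$$\Sigma_t = \bigoplus_{\sqrt{\lambda} \le t} \mathcal{H}_\lambda,$$

we have the following wavelet-type representation:

$$\forall \varphi \in \Sigma_{1/\delta}, \qquad \varphi(x) = \sum_{\xi \in \Lambda_{\gamma\delta}} \varphi(\xi)\, |B(\xi,\delta)|\, D_\xi^\delta(x), \tag{16}$$

$$\forall (\alpha_\xi)_{\xi \in \Lambda_{\gamma\delta}}, \qquad \Big\| \sum_{\xi \in \Lambda_{\gamma\delta}} \alpha_\xi\, |B(\xi,\delta)|\, D_\xi^\delta(x) \Big\|_\infty \lesssim \sup_{\xi \in \Lambda_{\gamma\delta}} |\alpha_\xi|. \tag{17}$$

We see from formulae (16) and (17) that the functions $|B(\xi,\delta)|\, D_\xi^\delta$ behave like father wavelets, with coefficients directly obtained by sampling. We will see in Appendix B that these functions play an important role, for instance to bound the entropy of various functional spaces.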

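In the same spirit, we can also define an analogue of the mother wavelet. Notice that this construction will not appear explicitly in our Bayesian setting but will be used in the proof, see Section 9.3.2. Let us fix $\Phi \in \mathcal{D}(\mathbb{R})$, $0 \le \Phi$, $\Phi(x) = 1$ for $|x| \le 1/2$, $\mathrm{supp}(\Phi) \subset [-1,1]$, and let us define also:

$$\Psi(x) = \Phi\Big(\frac{x}{2}\Big) - \Phi(x).$$

So

$$0 \le \Psi(x) \le 1, \quad \mathrm{supp}(\Psi) \subset \Big\{\frac{1}{2} \le |x| \le 2\Big\}; \qquad \text{for all } \delta > 0, \quad 1 = \Phi(\delta x) + \sum_{j \ge 0} \Psi(2^{-j}\delta x).$$

So

$$f = \Phi(\delta\sqrt{L}) f + \sum_{j \ge 0} \Psi(2^{-j}\delta\sqrt{L}) f,$$

$$\Phi(\delta\sqrt{L}) f(x) = \int_{\mathcal{M}} \Phi(\delta\sqrt{L})(x,y) f(y)\, d\mu(y); \qquad \Psi(\delta 2^{-j}\sqrt{L}) f(x) = \int_{\mathcal{M}} \Psi(\delta 2^{-j}\sqrt{L})(x,y) f(y)\, d\mu(y),$$

$$\Phi(\delta\sqrt{L}) f \in \Sigma_{1/\delta}; \qquad \Psi(2^{-j}\delta\sqrt{L}) f \in \Sigma_{2^{j+1}/\delta} \cap \big[\Sigma_{2^{j-1}/\delta}\big]^{\perp}.$$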
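The telescoping identity behind this decomposition can be checked numerically. The following is only an illustrative sketch: the particular bump function `phi`, built from $e^{-1/u}$, is an assumption (the text only requires a smooth $\Phi$ equal to 1 on $[-1/2,1/2]$ and supported in $[-1,1]$), and the names `phi`, `psi` and `partition_sum` are hypothetical.

```python
import math

def _g(u):
    # Smooth gluing function: exp(-1/u) for u > 0, and 0 otherwise.
    return math.exp(-1.0 / u) if u > 0 else 0.0

def phi(x):
    # One admissible choice of Phi: equal to 1 for |x| <= 1/2, zero for |x| >= 1.
    a = _g(1.0 - abs(x))
    b = _g(abs(x) - 0.5)
    return a / (a + b)  # a + b > 0 for every x

def psi(x):
    # Psi(x) = Phi(x/2) - Phi(x), supported in 1/2 <= |x| <= 2.
    return phi(x / 2.0) - phi(x)

def partition_sum(x, delta, jmax):
    # Phi(delta x) + sum_{j=0}^{jmax} Psi(2^{-j} delta x).
    # The sum telescopes to Phi(2^{-jmax-1} delta x), which equals 1 once jmax is large.
    return phi(delta * x) + sum(psi(2.0 ** (-j) * delta * x) for j in range(jmax + 1))
```

For any fixed $x$ the partial sums stabilise at 1 as soon as $2^{-j_{\max}-1}\delta|x| \le 1/2$, which is the discrete counterpart of the resolution of the identity used above.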

2.5 Examples. 

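Torus case. Let $\mathcal{M} = \mathbb{S}^1$ be the torus equipped with the normalized Lebesgue measure. $\mathcal{M}$ is parameterised by $[-\pi,\pi]$ with identification of $\pi$ and $-\pi$. The spectral decomposition of the Laplacian operator $\Delta$ gives rise to the classical Fourier basis, with

$$\mathcal{H}_0 = \mathrm{span}\{1\}; \qquad \mathcal{H}_k = \mathrm{span}\{e^{ikx}, e^{-ikx}\} = \mathrm{span}\{\sin kx, \cos kx\}.$$

Hence,

$$\dim(\mathcal{H}_0) = 1; \quad \text{for all } k \ge 1,\ \dim(\mathcal{H}_k) = 2 \ \text{ and } \ P_k(x,y) = 2\cos k(x-y),$$

$$e^{t\Delta}(x,y) = 1 + \sum_{k \ge 1} e^{-k^2 t}\, 2\cos k(x-y) = \sqrt{\frac{\pi}{t}} \sum_{l \in \mathbb{Z}} e^{-\frac{(x-y-2l\pi)^2}{4t}}.$$

Clearly, for all $t > 0$, $e^{t\Delta}(x,x) \ge 1$. It holds

$$\text{for all } 0 < t < 1,\ x, y \in [-\pi,\pi], \qquad \frac{C'}{\sqrt{t}}\, e^{-c'\rho(x,y)^2/t} \le e^{t\Delta}(x,y) \le \frac{C}{\sqrt{t}}\, e^{-c\,\rho(x,y)^2/t}.$$

Here we have, for any $x, y$ in $[-\pi,\pi]$,

$$\rho(x,y) = |x-y| \wedge (2\pi - |x-y|); \qquad \text{for all } 0 < r \le \pi, \quad |B(x,r)| = \frac{r}{\pi}.$$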
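The two expressions of the heat kernel on the torus (spectral series and wrapped Gaussian sum) can be compared numerically. The sketch below assumes the normalised Lebesgue measure as in the text; the function names and the truncation levels `kmax`, `lmax` are illustrative choices, not part of the paper.

```python
import math

def heat_kernel_fourier(x, y, t, kmax=200):
    # Spectral form: 1 + sum_{k>=1} exp(-k^2 t) * 2 cos(k (x - y)), truncated at kmax.
    return 1.0 + sum(2.0 * math.exp(-k * k * t) * math.cos(k * (x - y))
                     for k in range(1, kmax + 1))

def heat_kernel_wrapped(x, y, t, lmax=50):
    # Wrapped-Gaussian form: sqrt(pi/t) * sum_l exp(-(x - y - 2 pi l)^2 / (4 t)).
    return math.sqrt(math.pi / t) * sum(
        math.exp(-((x - y - 2.0 * math.pi * l) ** 2) / (4.0 * t))
        for l in range(-lmax, lmax + 1))
```

The spectral form converges quickly for large $t$, the wrapped form for small $t$; the latter makes the squared-exponential behaviour in $\rho(x,y)$ visible.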
Jacobi case. Let us now take $\mathcal{M} = [-1,1]$ equipped with the measure $\omega(x)\,dx$ with $\omega(x) = (1-x)^{\alpha}(1+x)^{\beta}$, $\alpha > -1$, $\beta > -1$.

If $\sigma(x) = (1 - x^2)$, then $\tau := \frac{(\sigma\omega)'}{\omega}$ is a polynomial of degree 1, and we put:

$$-L(f) = D_J(f) = \frac{(\sigma\omega f')'}{\omega} = \sigma f'' + \tau f'.$$

The operator $L$ is a nonnegative symmetric (in $L^2(\omega(x)dx)$) second order differential operator (here and in the sequel, $u'$ denotes the derivative of $u$).

Using Gram-Schmidt orthonormalisation (again, in $L^2(\omega(x)dx)$) of $\{x^k,\ k \in \mathbb{N}\}$ we get a family of orthonormal polynomials $\{\pi_k,\ k \in \mathbb{N}\}$ called Jacobi polynomials, which coincides with the spectral decomposition of $D_J$. It holds

$$D_J \pi_k = \Big[k(k-1)\frac{\sigma''}{2} + k\tau'\Big]\pi_k =: -\lambda_k \pi_k = -k(k+1+\alpha+\beta)\,\pi_k.$$

Then, for any $k \in \mathbb{N}$, $\mathcal{H}_{\lambda_k} = \mathrm{span}\{\pi_k\}$, $\dim(\mathcal{H}_{\lambda_k}) = 1$ and

$$P_k(x,y) = \pi_k(x)\pi_k(y); \qquad \lambda_k = k(k+\alpha+\beta+1).$$



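$$e^{-tL}(x,y) = c_{\alpha,\beta}^2 + \sum_{k \ge 1} e^{-tk(k+1+\alpha+\beta)}\, \pi_k(x)\pi_k(y), \qquad c_{\alpha,\beta}^2 \int \omega(x)\,dx = 1.$$

If

$$\rho(x,y) = |\arccos x - \arccos y| = \arccos\big(xy + \sqrt{1-x^2}\sqrt{1-y^2}\big),$$

then

$$\frac{C'\, e^{-c'\rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}} \le e^{-tL}(x,y) \le \frac{C\, e^{-c\,\rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}}.$$

But

$$\text{for all } x \in [-1,1],\ 0 < r \le \pi, \qquad |B(x,r)| \sim r\,\big((1-x) \vee r^2\big)^{\alpha+1/2}\big((1+x) \vee r^2\big)^{\beta+1/2}.$$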
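The eigenvalue relation for the Jacobi operator can be checked by applying $D_J = \sigma f'' + \tau f'$ to polynomials directly. The sketch below is an illustration under the assumptions $\sigma(x) = 1 - x^2$ and $\tau(x) = (\beta - \alpha) - (\alpha + \beta + 2)x$ (the latter obtained by expanding $(\sigma\omega)'/\omega$); polynomials are stored as coefficient lists and all function names are hypothetical.

```python
def poly_derivative(p):
    # Coefficients are stored in increasing degree: p[i] multiplies x**i.
    return [i * c for i, c in enumerate(p)][1:] or [0.0]

def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0.0) + (q[i] if i < len(q) else 0.0)
            for i in range(n)]

def jacobi_operator(p, alpha, beta):
    # D_J f = sigma f'' + tau f' with sigma = 1 - x^2, tau = (beta - alpha) - (alpha + beta + 2) x.
    sigma = [1.0, 0.0, -1.0]
    tau = [beta - alpha, -(alpha + beta + 2.0)]
    d1 = poly_derivative(p)
    d2 = poly_derivative(d1)
    return poly_add(poly_mul(sigma, d2), poly_mul(tau, d1))

def jacobi_eigenvalue(k, alpha, beta):
    # Eigenvalue of L = -D_J on the degree-k Jacobi polynomial.
    return k * (k + alpha + beta + 1.0)
```

Since $D_J$ maps polynomials of degree $k$ to polynomials of degree at most $k$, the eigenvalue can be read off the leading coefficient of $D_J x^k$, which is what the check on monomials exploits.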
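Sphere case. Let now $\mathcal{M} = \mathbb{S}^{n-1} \subset \mathbb{R}^n$. The geodesic distance on $\mathbb{S}^{n-1}$ is given by

$$\rho(x,y) = \cos^{-1}(\langle x, y \rangle), \qquad \langle x, y \rangle = \sum_{i=1}^{n} x_i y_i.$$

There is a natural measure $\sigma$ on $\mathbb{S}^{n-1}$ which is rotation invariant. There is a natural Laplacian on $\mathbb{S}^{n-1}$, $\Delta_{\mathbb{S}^{n-1}}$, which is a negative self-adjoint operator with the following spectral decomposition. If $\mathcal{H}_k$ is the restriction to $\mathbb{S}^{n-1}$ of polynomials of degree $k$ which are homogeneous (i.e. $P(x) = \sum_{|\alpha| = k} a_\alpha x^\alpha$, $\alpha = (\alpha_1, \ldots, \alpha_n)$, $|\alpha| = \sum_i \alpha_i$, $\alpha_i \in \mathbb{N}$) and harmonic (i.e. $\Delta P = \sum_{i=1}^{n} \frac{\partial^2 P}{\partial x_i^2} = 0$), we have

$$P \in \mathcal{H}_k(\mathbb{S}^{n-1}) \implies -\Delta_{\mathbb{S}^{n-1}} P = k(k+n-2) P := \lambda_k P.$$

Moreover $\dim(\mathcal{H}_k(\mathbb{S}^{n-1})) = \frac{2k+n-2}{n-2}\binom{n+k-3}{k} =: N_k(n)$. The space $\mathcal{H}_k$ is called the space of spherical harmonics of order $k$ (and with a slight abuse of notation $\mathcal{H}_k$ is our 'former' $\mathcal{H}_{\lambda_k}$). Moreover, if $(Y_{ki})_{1 \le i \le N_k}$ is an orthonormal basis of $\mathcal{H}_k$, the projector writes

$$P_k(x,y) = \sum_{1 \le i \le N_k} Y_{ki}(x)\, Y_{ki}(y).$$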
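Actually:

$$\forall (x,y) \in \mathbb{S}^{n-1} \times \mathbb{S}^{n-1}, \qquad P_k(x,y) = \frac{1}{|\mathbb{S}^{n-1}|}\Big(1 + \frac{k}{\nu}\Big) G_k^{\nu}(\langle x, y \rangle), \qquad \nu = \frac{n}{2} - 1, \quad |\mathbb{S}^{n-1}| = \frac{2\pi^{n/2}}{\Gamma(n/2)},$$

and $G_k^{\nu}$ is the Gegenbauer polynomial of index $\nu$ and degree $k$, defined for instance by its generating function:

$$(1 - 2xr + r^2)^{-\nu} = \sum_{k} r^k G_k^{\nu}(x).$$

Also, it holds

$$|B(x,r)| = |\mathbb{S}^{n-2}| \int_0^r (\sin t)^{n-2}\, dt,$$

so at least for $0 \le r \le \pi/2$

$$\Big(\frac{2}{\pi}\Big)^{n-2} |\mathbb{S}^{n-2}|\, \frac{r^{n-1}}{n-1} \le |B(x,r)| \le |\mathbb{S}^{n-2}|\, \frac{r^{n-1}}{n-1}$$

and clearly

$$\text{for all } 0 \le r \le \pi = \mathrm{diam}(\mathbb{S}^{n-1}), \qquad c_1 r^{n-1} \le |B(x,r)| \le c_2 r^{n-1}.$$

Now for all $0 < t \le 1$, it holds, by (12),

$$e^{t\Delta_{\mathbb{S}^{n-1}}}(x,y) = \sum_{k} e^{-tk(k+n-2)} P_k(x,y).$$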
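The dimensions and eigenvalues of the spaces of spherical harmonics are easy to tabulate. The sketch below does not use the closed formula of the text but the classical identity "harmonic polynomials of degree $k$ = homogeneous polynomials of degree $k$ modulo $\|x\|^2$ times those of degree $k-2$", which is an assumption brought in for the illustration; the function names are hypothetical.

```python
from math import comb

def spherical_harmonic_dim(k, n):
    # Dimension of the space of spherical harmonics of order k on S^{n-1}:
    # dim(homogeneous degree k in n variables) - dim(homogeneous degree k-2).
    if k == 0:
        return 1
    if k == 1:
        return n
    return comb(n + k - 1, k) - comb(n + k - 3, k - 2)

def sphere_eigenvalue(k, n):
    # Eigenvalue of -Laplacian on S^{n-1} for spherical harmonics of order k.
    return k * (k + n - 2)
```

For $n = 3$ this returns the familiar $2k+1$, and for $n \ge 3$ it agrees with $\frac{2k+n-2}{n-2}\binom{n+k-3}{k}$.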

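Ball case. Let $\mathcal{M} = B^d$ be the unit ball of $\mathbb{R}^d$. Let us consider the measure:

$$W_\mu(x)\,dx, \qquad W_\mu(x) = (1 - \|x\|^2)^{\mu - 1/2}, \quad \mu > 0.$$

Further define the operator

$$L f(x) = \Delta f(x) - x \cdot \nabla\big(x \cdot \nabla f(x)\big) - (2\mu + d - 1)\, x \cdot \nabla f(x)$$

$$\qquad = \frac{1}{W_\mu(x)}\, \mathrm{div}\big[(1 - \|x\|^2) W_\mu(x) \nabla f(x)\big] + \sum_{1 \le i < j \le d} D_{i,j}^2 f(x), \qquad \text{with } D_{i,j} f(x) = (x_j \partial_i - x_i \partial_j) f(x).$$

One can verify that:

$$\int L(f)(x) f(x) W_\mu(x)\,dx = -\int (1 - \|x\|^2)\, |\nabla f|^2(x)\, W_\mu(x)\,dx - \sum_{1 \le i < j \le d} \int [D_{i,j} f]^2(x)\, W_\mu(x)\,dx.$$

Let $\Pi_k(B^d)$ be the space of polynomials of degree at most $k$ on the unit ball of $\mathbb{R}^d$ and define $\mathcal{V}_k(B^d)$ by

$$\Pi_k(B^d) = \mathcal{V}_k(B^d) \oplus \Pi_{k-1}(B^d); \qquad \text{then} \qquad L^2(B^d, W_\mu(x)dx) = \bigoplus_{k=0}^{\infty} \mathcal{V}_k(B^d).$$

The space $\mathcal{V}_k(B^d)$ is an eigenspace of $L$. More precisely,

$$f \in \mathcal{V}_k(B^d) \implies L(f) = -k(k + d + 2\mu - 1) f.$$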



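The projector on $\mathcal{V}_n$ is given by the following formula (see [10], [23]), with $\lambda = \mu + \frac{d-1}{2}$:

$$P_n(x,y) = c(d,\mu)\, \frac{n+\lambda}{\lambda} \int_{-1}^{1} G_n^{\lambda}\big(\langle x, y \rangle + u\sqrt{1 - \|x\|^2}\sqrt{1 - \|y\|^2}\big)\, (1 - u^2)^{\mu - 1}\, du,$$

$$e^{tL}(x,y) = \sum_{n \ge 0} e^{-tn(n+2\lambda)} P_n(x,y).$$

The natural associated metric is

$$\rho(x,y) = \arccos\big(\langle x, y \rangle + \sqrt{1 - \|x\|^2}\sqrt{1 - \|y\|^2}\big).$$

Moreover, it holds

$$|e^{tL}(x,y)| \le C(c)\, \frac{e^{-c\,\rho^2(x,y)/t}}{t^{d/2}\big(\sqrt{t} + \sqrt{1 - \|x\|^2}\big)^{\mu}\big(\sqrt{t} + \sqrt{1 - \|y\|^2}\big)^{\mu}}.$$

But

$$|B(x,r)| \sim r^d \big(r + \sqrt{1 - \|x\|^2}\big)^{2\mu} \sim r^d \big(r^2 + (1 - \|x\|^2)\big)^{\mu} \sim r^d \big(r^2 \vee (1 - \|x\|^2)\big)^{\mu}.$$

Clearly,

$$|B(x,2r)| \le 2^{s} |B(x,r)|, \qquad s = d + 2\mu.$$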

Compact Riemannian manifold, without boundary. Let $\mathcal{M}$ be a compact Riemannian manifold of dimension $n$. Associated to the Riemannian structure we have a measure $dx$, a metric $\rho$, and a Laplacian $\Delta$. One has:

$$\int_{\mathcal{M}} \Delta f(x)\, g(x)\,dx = -\int_{\mathcal{M}} \nabla(f)(x) \cdot \nabla(g)(x)\,dx.$$

So $-\Delta$ is a symmetric nonnegative operator. Now the associated semigroup $e^{t\Delta}$ is a positive kernel operator verifying:

$$\frac{C'\, e^{-c'\rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}} \le e^{t\Delta}(x,y) \le \frac{C\, e^{-c\,\rho^2(x,y)/t}}{\sqrt{|B(x,\sqrt{t})|\,|B(y,\sqrt{t})|}}.$$

The main property is that, see the appendix in Section 10,

$$\exists\, 0 < c_1 < c_2 < \infty \text{ such that, for all } r \le \mathrm{diam}(\mathcal{M}), \qquad c_1 r^n \le |B(x,r)| \le c_2 r^n.$$

3 RKHS and heat kernel Gaussian process 

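It is well known that for any $t > 0$, $P_t(x,y)$ is associated to a RKHS $\mathbb{H}_t$ and there is a centered Gaussian process $(W^t(x))_{x \in \mathcal{M}}$ such that $E(W^t(x) W^t(y)) = P_t(x,y)$ for any $x, y$ in $\mathcal{M}$. For instance, $W^t(x)$ can be built in the following way:

$$W^t(x) = \sum_i X_i\, \varphi_i(x),$$

where $\varphi_i(\cdot)$ is any orthonormal basis of $\mathbb{H}_t$, and $\{X_i,\ i \in \mathbb{N}\}$ is a family of independent Gaussian variables with mean 0 and variance 1. Also, $\mathbb{H}_t$ is the isometric image of $L^2$ by $P_t^{1/2} = P_{t/2}$. So the family $\{e^{-\lambda_k t/2} e_k^l,\ k \in \mathbb{N},\ 1 \le l \le \dim(\mathcal{H}_{\lambda_k})\}$ is a 'natural' orthonormal basis of $\mathbb{H}_t$. Hence, we have the following description of $W^t$:

$$W^t(x) = \sum_k \sum_{1 \le l \le \dim \mathcal{H}_k} e^{-\lambda_k t/2} X_k^l\, e_k^l(x),$$

where $X_k^l$ is a family of independent Gaussian variables with mean 0 and variance 1.
The RKHS $\mathbb{H}_t$ has also the following description:

$$\mathbb{H}_t = \Big\{ h^t = \sum_k \sum_{1 \le l \le \dim(\mathcal{H}_k)} a_k^l\, e^{-\lambda_k t/2} e_k^l(x), \quad \sum_{k,l} |a_k^l|^2 < +\infty \Big\},$$

equipped with the inner product

$$\Big\langle \sum_k \sum_{1 \le l \le \dim(\mathcal{H}_k)} a_k^l\, e^{-\lambda_k t/2} e_k^l,\ \sum_k \sum_{1 \le l \le \dim(\mathcal{H}_k)} b_k^l\, e^{-\lambda_k t/2} e_k^l \Big\rangle_{\mathbb{H}_t} = \sum_k \sum_{1 \le l \le \dim(\mathcal{H}_k)} a_k^l b_k^l.$$

Hence, if we denote by $\mathbb{H}_t^1$ the unit ball of $\mathbb{H}_t$:

$$\mathbb{H}_t^1 = \Big\{ \sum_k \sum_{1 \le l \le \dim(\mathcal{H}_k)} a_k^l\, e^{-\lambda_k t/2} e_k^l(x), \quad \sum_{k,l} |a_k^l|^2 \le 1 \Big\}.$$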
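To make the series construction concrete, here is a minimal sketch on the torus of Section 2.5, where $\lambda_k = k^2$ and the eigenfunctions are $1$, $\sqrt{2}\cos kx$, $\sqrt{2}\sin kx$. The truncation level `kmax` and all function names are assumptions made for the illustration; the truncated series only approximates $W^t$.

```python
import math
import random

def torus_basis(k, x):
    # Orthonormal basis of the k-th eigenspace in L^2 of the circle (normalised measure).
    if k == 0:
        return [1.0]
    return [math.sqrt(2.0) * math.cos(k * x), math.sqrt(2.0) * math.sin(k * x)]

def covariance_from_expansion(x, y, t, kmax=60):
    # E[W^t(x) W^t(y)] for the truncated series: sum_k exp(-k^2 t) sum_l e_k^l(x) e_k^l(y).
    return sum(
        math.exp(-k * k * t)
        * sum(ex * ey for ex, ey in zip(torus_basis(k, x), torus_basis(k, y)))
        for k in range(kmax + 1))

def sample_path(t, xs, kmax=60, rng=None):
    # One draw of the truncated process W^t evaluated at the points xs.
    rng = rng or random.Random()
    coeffs = [[rng.gauss(0.0, 1.0) for _ in torus_basis(k, 0.0)]
              for k in range(kmax + 1)]
    return [
        sum(math.exp(-k * k * t / 2.0)
            * sum(c * e for c, e in zip(coeffs[k], torus_basis(k, x)))
            for k in range(kmax + 1))
        for x in xs]
```

Smaller values of `t` keep more high-frequency terms in the sum and therefore produce rougher paths, which is the rescaling mechanism exploited by the prior.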

3.1 Entropy of the RKHS 

Let $(X, \rho)$ be a metric space. For $\varepsilon > 0$, we define, as usual, the covering number $N(\varepsilon, X)$ as the smallest number of balls of radius $\varepsilon$ covering $X$. The entropy $H(\varepsilon, X)$ is by definition $H(\varepsilon, X) = \log_2 N(\varepsilon, X)$.

An important result of this section is the link between the covering number $N(\varepsilon, \mathcal{M}, \rho)$ of the space $\mathcal{M}$, and $H(\varepsilon, \mathbb{H}_t^1, L^p)$ for $p = 2, \infty$, where $\mathbb{H}_t^1$ is the unit ball of the RKHS defined in the subsection above. More precisely we prove in Appendix B the following theorem:

Theorem 1 Let us fix $\mu > 0$, $a > 0$. There exists $\varepsilon_0 > 0$ such that for $\varepsilon, t$ with $\varepsilon^{\mu} \le at$ and $0 < \varepsilon \le \varepsilon_0$,

$$H(\varepsilon, \mathbb{H}_t^1, L^2) \sim H(\varepsilon, \mathbb{H}_t^1, L^\infty) \sim N(\delta(t,\varepsilon), \mathcal{M}) \cdot \log\frac{1}{\varepsilon}, \qquad \text{where } \frac{1}{\delta(t,\varepsilon)} := \sqrt{\frac{1}{t}\log\Big(\frac{1}{\varepsilon}\Big)}.$$

Remark 3 Theorem 1 gives the precise behaviour up to constants, from above and below, of the entropy of the RKHS unit ball $\mathbb{H}_t^1$. The constants involved depend only on $\mathcal{M}, a, \mu$. The mild restriction on the range of $t$ arises from technical reasons in the proof of the upper bound. As an examination of the proof reveals, this restriction is not needed in the proof of the lower bound.

3.2 Uniform polynomial control for the measures of the balls 

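In Appendix B, the general case is considered, but for the sake of simplicity, in the sequel, we will concentrate on the following case where the entropy of $\mathcal{M}$ has an exact polynomial control.

As a matter of fact, in a number of examples, for instance for compact Riemannian manifolds without boundary, see Appendix C, the bounds in (4) are the same and we have the following uniform polynomial control (Ahlfors condition, see for instance [17]). There exist $c_1 > 0$, $c_2 > 0$, $d > 0$ such that

$$\text{for all } x \in \mathcal{M},\ \text{for all } 0 < r \le 1, \qquad c_1 r^d \le |B(x,r)| \le c_2 r^d. \tag{18}$$

Necessarily, $d \le D$ since using (3), we have $|B(x,r)| \ge (r/2)^D$.
In this case, Theorem 1 takes the following form.

Proposition 1 In the case (18), we have

$$\frac{1}{c_2}\Big(\frac{1}{\varepsilon}\Big)^d \le N(\varepsilon, \mathcal{M}) \le \mathrm{card}(\Lambda_\varepsilon) \le \frac{2^d}{c_1}\Big(\frac{1}{\varepsilon}\Big)^d, \tag{19}$$

$$H(\varepsilon, \mathbb{H}_t^1, L^2) \sim H(\varepsilon, \mathbb{H}_t^1, L^\infty) \sim \Big(\frac{1}{t}\log\frac{1}{\varepsilon}\Big)^{d/2} \log\frac{1}{\varepsilon}, \tag{20}$$

for all $0 < \varepsilon \le \varepsilon_0$, if we suppose that for some $\mu > 0$, $a > 0$, $\varepsilon^{\mu} \le at$.

Indeed, let $(B(x_i,\varepsilon))_{i \in I}$ be a minimal covering of $\mathcal{M}$; we have

$$1 = |\mathcal{M}| \le \sum_{i \in I} |B(x_i,\varepsilon)| \le N(\varepsilon, \mathcal{M})\, c_2\, \varepsilon^d.$$

Now if $\Lambda_\varepsilon$ is any maximal $\varepsilon$-net, we have:

$$1 = |\mathcal{M}| \ge \sum_{\xi \in \Lambda_\varepsilon} |B(\xi, \varepsilon/2)| \ge \mathrm{card}(\Lambda_\varepsilon)\, c_1\, (\varepsilon/2)^d.$$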
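The counting argument can be illustrated on the torus, where $|B(x,r)| = r/\pi$, so that (18) holds with $d = 1$ and $c_1 = c_2 = 1/\pi$, and the bounds on the size of an $\varepsilon$-net read $\pi/\varepsilon \le \mathrm{card}(\Lambda_\varepsilon) \le 2\pi/\varepsilon$. The sketch below builds an $\varepsilon$-separated set greedily on a fine grid; the grid size and function names are assumptions, and maximality only holds relative to the grid.

```python
import math

def circle_distance(x, y):
    # Geodesic distance on the circle parameterised by [-pi, pi].
    d = abs(x - y) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def greedy_net(eps, grid_size=10000):
    # Scan a fine grid and keep every point at distance >= eps from the points kept so far.
    net = []
    for i in range(grid_size):
        x = -math.pi + 2.0 * math.pi * i / grid_size
        if all(circle_distance(x, z) >= eps for z in net):
            net.append(x)
    return net
```

The separation of the net points gives the upper bound through disjoint balls of radius $\varepsilon/2$, exactly as in the last display above.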



4 Geometrical Prior: concentration function 

For the Bayesian results in the next four sections, it is assumed that the compact metric space $(\mathcal{M}, \rho)$ is as in Section 2.1, that the polynomial estimate (18) for the volume of balls holds and that there exists a heat-kernel operator with the properties listed in Section 2.2.



4.1 Prior, definition 

We consider a prior on functions constructed hierarchically as follows. First draw a positive random variable $T$ according to a density $g$ on $(0,1]$. Suppose that (18) holds and assume there exist a real $a > 1$ and positive constants $c_1, c_2, q$ such that, with $d$ defined in (18),

$$c_1\, t^{-a} e^{-t^{-d/2}\log^{q}(1/t)} \le g(t) \le c_2\, t^{-a} e^{-t^{-d/2}\log^{q}(1/t)}, \qquad t \in (0,1]. \tag{21}$$

We show below that the choice $q = 1 + d/2$ leads to sharp rates. The choice of this particular form for the prior on $t$ is related to the form taken by the entropy of $\mathbb{H}_t^1$. For more discussion on this, see Section 7. Also, note that we do not consider large values of $t$. This corresponds to the fact that the trajectories of the process $W^t$ are already very smooth (in fact, infinitely differentiable almost surely). So, to capture rates of convergence for typical smoothness levels such as Sobolev or Besov indexes, we only need to make the paths 'rougher', which corresponds to $t$ small.
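A density of the type required in (21) can be handled numerically on a grid. The sketch below assumes the specific unnormalised form $t^{-a}\exp(-t^{-d/2}\log^q(1/t))$ (one member of the class allowed by (21)), works in log-scale to avoid overflow, and samples $T$ by a discretised inverse-CDF method; the grid size and all names are hypothetical.

```python
import bisect
import math
import random

def log_prior_density(t, a, d, q):
    # log of t^{-a} exp(-t^{-d/2} log^q(1/t)) for t in (0, 1], up to an additive constant.
    log_inv_t = -math.log(t)
    return a * log_inv_t - math.exp(0.5 * d * log_inv_t) * log_inv_t ** q

def sample_T(n, a, d, q, grid_size=2000, rng=None):
    # Approximate sampling from g by inverting the CDF of a midpoint-grid discretisation.
    rng = rng or random.Random()
    ts = [(i + 0.5) / grid_size for i in range(grid_size)]
    cdf, total = [], 0.0
    for t in ts:
        total += math.exp(log_prior_density(t, a, d, q))
        cdf.append(total)
    return [ts[min(bisect.bisect_left(cdf, rng.random() * total), grid_size - 1)]
            for _ in range(n)]
```

The exponential factor makes very small values of $t$ extremely unlikely a priori, which is what balances the fast growth of the RKHS entropy as $t \to 0$.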
Given $T = t$, generate a collection of independent standard normal variables $\{X_k^l\}$ with indexes $k \ge 0$, $1 \le l \le \dim(\mathcal{H}_{\lambda_k})$ and set, for $x$ in $\mathcal{M}$,

$$W^t(x) = \sum_k \sum_{1 \le l \le \dim \mathcal{H}_k} e^{-\lambda_k t/2} X_k^l\, e_k^l(x). \tag{22}$$

In the following the notation $W^t$ refers to the Gaussian prior (22) for a fixed value of $t$ in $(0,1]$. One can check that $W^t$ defines a Gaussian variable in various separable Banach spaces $\mathbb{B}$, see [30] for definitions. More precisely, we focus on the cases $\mathbb{B} = (C^0(\mathcal{M}), \|\cdot\|_\infty)$ and $\mathbb{B} = (L^2(\mathcal{M},\mu), \|\cdot\|_2)$. To do so, apply Theorem 4.2 in [30], where almost sure convergence of the series (22) in $\mathbb{B}$ follows from the properties of the Markov kernel (12).

The full (non-Gaussian) prior we consider is $W^T$, where $T$ is random with density $g$ given by (21). Hence, this construction leads to a prior $\Pi_w$, which is the probability measure induced by

$$W^T(x) = \sum_k \sum_{1 \le l \le \dim \mathcal{H}_k} e^{-\lambda_k T/2} X_k^l\, e_k^l(x). \tag{23}$$

4.2 Approximation and small ball probabilities 

The so-called concentration function of a Gaussian process defined below turns out to be fundamental to prove sharp concentration of the posterior measure. For this reason we focus now on the detailed study of this function for the geometrical prior.

In this paper, the notation $(\mathbb{B}, \|\cdot\|_{\mathbb{B}})$ is used for any one of the two spaces

$$(\mathbb{B}, \|\cdot\|_{\mathbb{B}}) = (C^0(\mathcal{M}), \|\cdot\|_\infty) \quad \text{or} \quad (\mathbb{B}, \|\cdot\|_{\mathbb{B}}) = (L^2, \|\cdot\|_2).$$

Any property stated below with a $\|\cdot\|_{\mathbb{B}}$-norm holds for both spaces.

Concentration function. Consider the Gaussian process (22) $W^t$, for a fixed $t \in (0,1]$. Its concentration function within $\mathbb{B}$ is defined, for any function $w_0$ in $\mathbb{B}$, as the sum of two terms

$$\varphi_{w_0}^t(\varepsilon) = \inf_{h^t \in \mathbb{H}_t,\ \|w_0 - h^t\|_{\mathbb{B}} < \varepsilon} \frac{\|h^t\|_{\mathbb{H}_t}^2}{2} - \log P(\|W^t\|_{\mathbb{B}} < \varepsilon) := A_{w_0}^t(\varepsilon) + S^t(\varepsilon).$$

Notice that the approximation term quantifies how well $w_0$ is approximable by elements of the RKHS $\mathbb{H}_t$ of the prior while keeping the 'complexity' of those elements, quantified in terms of RKHS-norm, as small as possible. The term $A_{w_0}^t(\varepsilon)$ is finite for all $\varepsilon > 0$ if and only if $w_0$ lies in the closure in $\mathbb{B}$ of $\mathbb{H}_t$ (which can be checked to coincide with the support of the Gaussian prior, see [30], Lemma 5.1). It turns out that for the prior $W^t$ this closure is $\mathbb{B}$ itself, as quite directly follows from the approximation results below.

In order to have a precise calibration of $A_{w_0}^t(\varepsilon)$, we will assume regularity conditions on the function $w_0$, which in turn will yield the rate of concentration. Namely we shall assume that $w_0$ belongs to a regularity class $\mathcal{F}_s(\mathcal{M})$, $s > 0$, taken equal to a Besov space

$$\mathcal{F}_s(\mathcal{M}) = B_{\infty,\infty}^s(\mathcal{M}) \ \text{ if } \mathbb{B} = C^0(\mathcal{M}) \quad (\text{resp. } B_{2,\infty}^s(\mathcal{M}) \ \text{ if } \mathbb{B} = L^2).$$
The problem of the regularity assumption in a context like the present one is not a simple one. We took here a natural generalization of the definition of the usual spaces on the real line, by means of approximation properties. For more details we refer to Appendix A. It holds, if $\Phi$ is a Littlewood-Paley function (i.e. verifies the conditions (47) of Appendix A) and $w_0 \in \mathcal{F}_s(\mathcal{M})$,

$$\|\Phi(\delta\sqrt{L}) w_0 - w_0\|_{\mathbb{B}} \le C\delta^s =: \varepsilon.$$



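Approximation term $A_{w_0}^t(\varepsilon)$, regularity assumption on $\mathcal{M}$. For any $w_0$ in the Banach space $\mathbb{B}$, consider the sequence of approximations, for $\delta \to 0$ and $L$ defined above, using (13), with $\Phi$ a function having support in $[0,1]$,

$$\Phi(\delta\sqrt{L}) w_0 = \sum_{\lambda_k \le \delta^{-2}} \Phi(\delta\sqrt{\lambda_k})\, P_{\mathcal{H}_{\lambda_k}} w_0,$$

where $P_{\mathcal{H}_{\lambda_k}}$ is the projector onto the linear space spanned by the $k$th eigenspace $\mathcal{H}_{\lambda_k}$ defined above. For any $\delta > 0$, the sum in the last display is finite, thus $\Phi(\delta\sqrt{L}) w_0$ belongs to $\mathbb{H}_t$.
On the other hand, making use of the previous choice $\delta^s =: \varepsilon$,

$$\|\Phi(\delta\sqrt{L}) w_0\|_{\mathbb{H}_t}^2 = \sum_{\lambda_k \le \delta^{-2}} |\Phi(\delta\sqrt{\lambda_k})|^2\, e^{\lambda_k t}\, \|P_{\mathcal{H}_{\lambda_k}} w_0\|_2^2 \le C \sum_{\lambda_k \le \delta^{-2}} e^{\lambda_k t}\, \|P_{\mathcal{H}_{\lambda_k}} w_0\|_2^2 \le C e^{t\delta^{-2}} \|w_0\|_2^2.$$

Note that $\|w_0\|_2 \le 1$ if we suppose that $w_0$ is in the unit ball of $\mathcal{F}_s(\mathcal{M})$ (since necessarily $\|w_0\|_{\mathbb{B}}$ is bounded by 1 and, for the case of the infinity norm, since $\mathcal{M}$ is compact with $\mu$-measure 1). Hence, we proved

$$A_{w_0}^t(\varepsilon) \le C e^{t\varepsilon^{-2/s}}, \qquad \text{if } w_0 \in \mathcal{F}_s(\mathcal{M}). \tag{24}$$

Note that this is precisely the place where the regularity of the function plays a role.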

Small ball probability $S^t(\varepsilon)$. Let us show in successive steps that the following upper bound on the small ball probability of the Gaussian process viewed as a random element in $\mathbb{B}$ holds.

Proposition 2 Fix $A > 0$. There exists a universal constant $\varepsilon_0 > 0$, and constants $C_0, C_1 > 0$ which depend on $d$, $A$ only, such that, for any $\varepsilon \le \varepsilon_0$ and any $t \in [C_1\varepsilon^A, 1]$,

$$-\log P(\|W^t\|_{\mathbb{B}} \le \varepsilon) \le C_0 \Big(\frac{1}{t}\Big)^{d/2} \Big(\log\frac{1}{\varepsilon}\Big)^{1+d/2}. \tag{25}$$

The steps follow the method proposed by [31]. The starting point is a bound on the entropy of the unit ball $\mathbb{H}_t^1$ of $\mathbb{H}_t$ with respect to the sup-norm, which is a direct consequence of (20) and is summarized by the following:
There exists a universal constant $\varepsilon_1 > 0$, and constants $C_2, C_3 > 0$ which depend on $d$, $A$ only, such that, for any $\varepsilon \le \varepsilon_1$ and any $t \in [C_2\varepsilon^A, 1]$,

$$\log N(\varepsilon, \mathbb{H}_t^1, \|\cdot\|_\infty) \le C_3 \Big(\frac{1}{t}\Big)^{d/2} \Big(\log\frac{1}{\varepsilon}\Big)^{1+d/2}. \tag{26}$$

Step 1, crude bound. Let $u_t$ be the mapping canonically associated to $W^t$ considered in [20] and, as in this article, set

$$e_n(u_t) := \inf\{\eta > 0,\ N(\eta, \mathbb{H}_t^1, \|\cdot\|_{\mathbb{B}}) \le 2^{n-1}\} \le \inf\{0 < \eta < 1,\ \log N(\eta, \mathbb{H}_t^1, \|\cdot\|_{\mathbb{B}}) \le (n-1)\log 2\}.$$

By definition, the previous quantity is smaller than the solution of the following equation in $\eta$, where we use the bound (26),

$$C\, t^{-d/2} \log^{1+d/2}\frac{1}{\eta} = n,$$

that is $\eta = \exp\{-C n^{\frac{2}{2+d}} t^{\frac{d}{2+d}}\}$. Thus

$$e_n(u_t) \le \exp\{-C n^{\frac{2}{2+d}} t^{\frac{d}{2+d}}\}, \qquad n \ge 1.$$
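The inversion of the entropy bound can be checked numerically. In the sketch below the generic constant $C$ of the text is made explicit as a parameter (so the constant in the exponent is $C^{-2/(2+d)}$, which the text absorbs into a new $C$); the function names are hypothetical.

```python
import math

def entropy_equation_lhs(eta, t, d, C=1.0):
    # Left-hand side C t^{-d/2} log^{1+d/2}(1/eta) of the equation in eta.
    return C * t ** (-d / 2.0) * math.log(1.0 / eta) ** (1.0 + d / 2.0)

def entropy_number_bound(n, t, d, C=1.0):
    # Solution eta = exp(-(n t^{d/2} / C)^{2/(2+d)}) of entropy_equation_lhs(eta) = n.
    return math.exp(-((n * t ** (d / 2.0) / C) ** (2.0 / (2.0 + d))))
```

The resulting bound decays like a stretched exponential in $n$, which is why all polynomial moments $k^{2m} e_k(u_t)$ used in the next step remain bounded.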
The first equation of [28], page 300 can be written

$$\sup_{k \le n} k^{\alpha} e_k(u_t^*) \le 32 \sup_{k \le n} k^{\alpha} e_k(u_t).$$

We have, for any $k \ge 1$ and any $m \ge 1$,

$$k^{2m} e_k(u_t) \le k^{2m} \exp\{-C k^{\frac{2}{2+d}} t^{\frac{d}{2+d}}\} \le t^{-md} (k^2 t^d)^m \exp\{-C (k^2 t^d)^{\frac{1}{2+d}}\} \le t^{-md}\, v_m(k^2 t^d),$$

where $v_m : x \to x^m e^{-C x^{1/(2+d)}}$ is uniformly bounded on $(0,+\infty)$ by a finite constant $c_m$ (we omit the dependence in $d$ in the notation). It follows that for any $n \ge 1$,

$$n^{2m} e_n(u_t^*) \le \sup_{k \le n} k^{2m} e_k(u_t^*) \le 32 \sup_{k \le n} k^{2m} e_k(u_t) \le 32\, c_m\, t^{-md}.$$

We have obtained $e_k(u_t^*) \le 32\, c_m\, t^{-md} k^{-2m}$ for any $k \ge 1$. Lemma 2.1 in [20], itself cited from [24], can be written as follows. If $\ell_n(u_t)$ denotes the $n$-th approximation number of $u_t$ as defined in [20] p. 1562,

$$\ell_n(u_t) \le c_1 \sum_{k \ge c_2 n} e_k(u_t^*)\, k^{-1/2} (1 + \log k).$$

From the bound on $e_k(u_t^*)$ above one deduces, for some constant $c_m'$ depending only on $m$, for any $n \ge 1$,

$$\ell_n(u_t) \le c_m'\, t^{-md}\, n^{1-2m}.$$
Consider the definitions, for any $\varepsilon > 0$ and $t > 0$,

$$n_t(\varepsilon) := \max\{n : 4\ell_n(u_t) \ge \varepsilon\}, \qquad \sigma(W^t) = E\big[\|W^t\|_{\mathbb{B}}^2\big]^{1/2}.$$

A sufficient condition for $n_t(\varepsilon)$ to exist is $4\sigma(W^t) \ge \varepsilon$, since $\ell_n(u_t) \le \ell_1(u_t) = \sigma(W^t)$. So, provided $\varepsilon \le 4\sigma(W^t)$, the bound on $\ell_n$ implies $n_t(\varepsilon) \le C_m (\varepsilon^{-1} t^{-md})^{1/(2m-1)}$.

The following result makes Proposition 2.3 in [20] precise with respect to constants involving the process under consideration. This is important in our context since we consider a collection of processes $\{W^t\}$ indexed by $t$ and need to keep track of the dependence in $t$.

Proposition 3 Let $X$ be centered Gaussian in a real separable Banach space $(E, \|\cdot\|)$. Define $n(\varepsilon)$ and $\sigma(X)$ as above. Then for a universal constant $C_4 > 0$, any $\varepsilon \le 1 \wedge (4\sigma(X))$,

$$-\log P[\|X\| < \varepsilon] \le C_4\, n(\varepsilon) \log\Big[\frac{6 n(\varepsilon)(\sigma(X) \vee 1)}{\varepsilon}\Big].$$

Explicit upper and lower bounds for $\sigma(W^t)$ are given in Appendix B, see (58)-(59). In the 'polynomial case', see (18), these bounds imply, uniformly in the interval of $t$'s considered, that $1 \lesssim \sigma(W^t) \lesssim \varepsilon^{-B}$ for some $B > 0$.

Combining this fact with Proposition 3 and the previous bound on $n_t$, we obtain that for some positive constants $C_7, \varepsilon_3, \zeta$, for any $\varepsilon \le \varepsilon_3$ and $t \in [C_2\varepsilon^A, 1]$,

$$S^t(\varepsilon) = -\log P(\|W^t\|_{\mathbb{B}} < \varepsilon) \le C_7\, \varepsilon^{-\zeta}. \tag{27}$$

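Step 2, general link between entropy and small ball. According to Lemma 1 in [18], we have, if $G$ is the distribution function of the standard Gaussian distribution (see their formula (3.19), or (3.2)),

$$S^t(2\varepsilon) + \log G\big(\lambda + G^{-1}(e^{-S^t(\varepsilon)})\big) \le \log N\Big(\frac{\varepsilon}{\lambda}, \mathbb{H}_t^1, \|\cdot\|_{\mathbb{B}}\Big).$$

Lemma 4.10 in [31] implies, for every $x > 0$,

$$G\big(\sqrt{2x} + G^{-1}(e^{-x})\big) \ge 1/2.$$

Take $\lambda = \sqrt{2S^t(\varepsilon)}$ in the previous display. Then for values of $t, \varepsilon$ such that (26) holds,

$$S^t(2\varepsilon) + \log\frac{1}{2} \le C \Big(\frac{1}{t}\Big)^{d/2} \Big(\log\frac{\sqrt{2S^t(\varepsilon)}}{\varepsilon}\Big)^{1+d/2}.$$

Finally combine this with (27) to obtain the desired Equation (25), that is

$$S^t(\varepsilon) \le C_0 \Big(\frac{1}{t}\Big)^{d/2} \Big(\log\frac{1}{\varepsilon}\Big)^{1+d/2},$$

under the conditions $\varepsilon \le \varepsilon_3$ and $C_2\varepsilon^A \le t \le 1$.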
5 General conditions for posterior rates

A general theory to obtain convergence rates for posterior distributions relative to some distances is presented in [12] and [14]. The object of interest is a function $f_0$ (e.g. a regression function, a density function etc.). In some cases, for instance density estimation with Gaussian priors, one cannot directly put the prior on the density itself (a Gaussian prior does not lead to positive paths). This is why we will parametrize the considered statistical problem with the help of a function $w_0$ in some separable Banach space $(\mathbb{B}, \|\cdot\|_{\mathbb{B}})$ of functions defined over $(\mathcal{M}, \rho)$. In some cases (e.g. regression) $w_0$ and $f_0$ coincide, in others not (e.g. density estimation), see examples below. As before, $\mathbb{B}$ is either $C^0(\mathcal{M})$ or $L^2$.
In this section we check that there exist Borel measurable subsets $B_n$ in $(\mathbb{B}, \|\cdot\|_{\mathbb{B}})$ such that, for some vanishing sequences $\varepsilon_n$ and $\bar\varepsilon_n$, some $C > 0$ and $n$ large enough,

$$P(\|W^T - w_0\|_{\mathbb{B}} \le \varepsilon_n) \ge e^{-Cn\varepsilon_n^2}, \tag{28}$$

$$P(W^T \notin B_n) \le e^{-(C+4)n\varepsilon_n^2}, \tag{29}$$

$$\log N(\bar\varepsilon_n, B_n, \|\cdot\|_{\mathbb{B}}) \le n\bar\varepsilon_n^2. \tag{30}$$

In Section 6, we show how this quite directly implies posterior concentration results. In [31], the authors also follow this approach. One advantage of the prior considered here is that, contrary to [31], the RKHS unit balls are precisely nested as the time parameter $t$ varies, see (33). This leads to slightly simplified proofs.

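Prior mass. For any fixed function $w_0$ in $\mathbb{B}$ and any $\varepsilon > 0$, by conditioning on the value taken by the random variable $T$,

$$P(\|W^T - w_0\|_{\mathbb{B}} < 2\varepsilon) = \int_0^1 P(\|W^t - w_0\|_{\mathbb{B}} < 2\varepsilon)\, g(t)\,dt.$$

The following inequality links the mass of Banach-space balls for Gaussian priors with their concentration function in $\mathbb{B}$, see [30], Lemma 5.3,

$$e^{-\varphi_{w_0}^t(\varepsilon)} \le P(\|W^t - w_0\|_{\mathbb{B}} < 2\varepsilon) \le e^{-\varphi_{w_0}^t(2\varepsilon)},$$

for any $w_0$ in the support of $W^t$. We have seen above that any $f_0$ in $\mathcal{F}_s(\mathcal{M})$ belongs to the support of the prior. It is not hard to adapt the argument to check that in fact any $f_0$ in $\mathbb{B}$ can be approximated in $\mathbb{B}$ by a sequence of elements in the RKHS $\mathbb{H}_t$ and thus belongs to the support in $\mathbb{B}$ of the prior by Lemma 5.1 in [30]. Then

$$P(\|W^T - w_0\|_{\mathbb{B}} < 2\varepsilon) \ge \int_0^1 e^{-\varphi_{w_0}^t(\varepsilon)} g(t)\,dt \ge \int_{t_\varepsilon^*}^{2t_\varepsilon^*} e^{-\varphi_{w_0}^t(\varepsilon)} g(t)\,dt,$$

for some $t_\varepsilon^*$ to be chosen.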
The concentration function is bounded from above, under the conditions e < £3 and t in \Ci£ A , 1],



by 



<PL(*)<C 



'+(^1 dog 5 



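Set $t^*_\varepsilon = \delta\,\varepsilon^{2/s}\log(1/\varepsilon)$ with $\delta$ small enough to be chosen. This is compatible with the above conditions
provided $A > 2/s$. Then for $\varepsilon$ small enough and any $t \in [t^*_\varepsilon, 2t^*_\varepsilon]$,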

r 



Set S = d/(4s). One obtains, for any i G [*e,2i*] 



2rf 



<T tf e" log 



1 



</4 ( e ) < C dS • (^log 

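Inserting this estimate in the previous bound on the prior mass, one gets, together with (21), for $\varepsilon$ small
enough and $q \le 1 + d/2$,

$$P(\|W^T - w_0\|_{\mathbb{B}} < 2\varepsilon) \ge C e^{-C\varepsilon^{-d/s}\log(1/\varepsilon)}\; t^*_\varepsilon \inf_{t\in[t^*_\varepsilon,\,2t^*_\varepsilon]} g(t)$$
$$\ge C\,\varepsilon^{2(1-a)/s}\Big(\log\frac{1}{\varepsilon}\Big)^{1-a} e^{-C\varepsilon^{-d/s}\log(1/\varepsilon)}\, e^{-C\varepsilon^{-d/s}(\log(1/\varepsilon))^{q-d/2}}$$
$$\ge C e^{-C'\varepsilon^{-d/s}\log(1/\varepsilon)}. \qquad (31)$$

Condition (28) is satisfied for the choice

$$\varepsilon_n \sim \Big(\frac{\log n}{n}\Big)^{s/(2s+d)}. \qquad (32)$$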
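The balance behind the choice (32) can be checked numerically. The sketch below is only an illustration: it assumes that the exponent in the prior-mass bound is of order $\varepsilon^{-d/s}\log(1/\varepsilon)$ (the case $q \le 1 + d/2$), and the function names `eps_n`, `prior_mass_exponent` and `balance_ratio` are hypothetical helpers, not part of the paper. It compares that exponent with $n\varepsilon_n^2$.

```python
import math


def eps_n(n, s, d):
    """Candidate rate (log n / n)^(s / (2s + d))."""
    return (math.log(n) / n) ** (s / (2 * s + d))


def prior_mass_exponent(eps, s, d):
    """Assumed order of the exponent in the prior-mass bound: eps^(-d/s) * log(1/eps)."""
    return eps ** (-d / s) * math.log(1.0 / eps)


def balance_ratio(n, s, d):
    """Ratio of the prior-mass exponent to n * eps_n^2 at the candidate rate."""
    e = eps_n(n, s, d)
    return prior_mass_exponent(e, s, d) / (n * e * e)


if __name__ == "__main__":
    # The ratio stays bounded as n grows, so (28) holds with a suitable constant C.
    for n in (10**3, 10**6, 10**9):
        print(n, balance_ratio(n, s=2.0, d=3.0))
```

Algebraically the ratio equals $\log(1/\varepsilon_n)/\log n$, so it never exceeds $s/(2s+d)$; this is why the two sides of (28) match up to a constant.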



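Sieve. The idea is to build sieves using Borell's inequality. Recall here that $\mathbb{H}^1_t$ is the unit ball of the
RKHS of the centered Gaussian process $W^t$, viewed as a process on the Banach space $\mathbb{B}$. The notation
$\mathbb{B}_1$ (as well as $\mathbb{H}^1_t$) stands for the unit ball of the associated space.

First, notice that from the explicit form of the RKHS of $W^t$, we have

$$\text{If } t_2 \ge t_1, \text{ then } \mathbb{H}^1_{t_2} \subset \mathbb{H}^1_{t_1}. \qquad (33)$$

Let us set for $M = M_n$, $\varepsilon = \varepsilon_n$ and $r > 0$ to be chosen later,

$$B_n = M\,\mathbb{H}^1_r + \varepsilon\,\mathbb{B}_1.$$

Consider the case $t \ge r$, then using (33)

$$P(W^t \notin B_n) = P(W^t \notin M\,\mathbb{H}^1_r + \varepsilon\,\mathbb{B}_1) \le P(W^t \notin M\,\mathbb{H}^1_t + \varepsilon\,\mathbb{B}_1) \le 1 - G\big(G^{-1}(e^{-\varphi^t_0(\varepsilon)}) + M\big), \qquad (34)$$

where the last line follows from Borell's inequality.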
Choices of $\varepsilon$, $r$ and $M$. Let us set $\varepsilon = \varepsilon_n$ given by (32) and

$$r^{-d/2} \sim \frac{n\varepsilon_n^2}{(\log n)^{1+d/2}} \quad\text{and}\quad M^2 \sim n\varepsilon_n^2. \qquad (35)$$

First, one checks that $r$ belongs to $[C_1\varepsilon^{A}, 1]$. This is clear from the definition since we have assumed
$A > 2/s$. Then any $t \in [r, 1]$ also belongs to $[C_1\varepsilon^{A}, 1]$ so we can use the entropy bound and write

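$$\varphi^t_0(\varepsilon) \le C t^{-d/2}\Big(\log\frac{1}{\varepsilon}\Big)^{1+d/2} \le C r^{-d/2}\Big(\log\frac{1}{\varepsilon}\Big)^{1+d/2} =: \varphi^*_n.$$

Now the bounds $-\sqrt{2\log(1/u)} \le G^{-1}(u) \le -\tfrac12\sqrt{\log(1/u)}$, valid for $u \in (0, 1/4)$, imply that

$$1 - G\big(G^{-1}(e^{-\varphi^t_0(\varepsilon)}) + M\big) \le 1 - G\big(G^{-1}(e^{-\varphi^*_n}) + M\big) \le e^{-M^2/8},$$

as soon as $M \ge 4\sqrt{\varphi^*_n}$ and $e^{-\varphi^*_n} < 1/4$.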
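The quantile bounds and the resulting tail estimate can be checked numerically. The following is a minimal sketch, reading $G$ as the standard normal distribution function; the helper names `quantile_bounds_hold` and `borell_tail` are illustrative and do not come from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal law, playing the role of G


def quantile_bounds_hold(u):
    """Check -sqrt(2 log(1/u)) <= G^{-1}(u) <= -0.5 * sqrt(log(1/u)) for one u in (0, 1/4)."""
    lower = -math.sqrt(2.0 * math.log(1.0 / u))
    upper = -0.5 * math.sqrt(math.log(1.0 / u))
    return lower <= STD_NORMAL.inv_cdf(u) <= upper


def borell_tail(phi, M):
    """Right-hand side of Borell's inequality: 1 - G(G^{-1}(e^{-phi}) + M)."""
    return 1.0 - STD_NORMAL.cdf(STD_NORMAL.inv_cdf(math.exp(-phi)) + M)


if __name__ == "__main__":
    for phi in (1.5, 5.0, 50.0):           # all satisfy e^{-phi} < 1/4
        M = 4.0 * math.sqrt(phi)           # smallest M allowed by the condition
        print(phi, borell_tail(phi, M), math.exp(-M * M / 8.0))
```

With $M = 4\sqrt{\varphi}$ the argument of $G$ is at least $M/2$, which is what produces the factor $1/8$ in the exponent.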

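To check $e^{-\varphi^*_n} < 1/4$ note that $\varphi^*_n \ge \varphi^r_0(\varepsilon)$, which can be further bounded from below using Equation
(3.1) in [18], which leads to, for any $\varepsilon, \lambda > 0$,

$$\varphi^r_0(\varepsilon) \ge H(2\varepsilon, \lambda\mathbb{H}^1_r) - \frac{\lambda^2}{2} \ge C r^{-d/2}\Big(\log\frac{\lambda}{\varepsilon}\Big)^{1+d/2} - \frac{\lambda^2}{2}.$$

Here we have used the bound from below of the entropy, see (20). Then take $\lambda = 1$ to obtain $\varphi^r_0(\varepsilon) \ge \log(4)$
for $\varepsilon$ small enough.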

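The first inequality $M \ge 4\sqrt{\varphi^*_n}$ is satisfied if

$$M^2 \ge 16\,C\, r^{-d/2}\Big(\log\frac{1}{\varepsilon}\Big)^{1+d/2},$$

and this holds for the choices of $r$ and $M$ given by (35). Hence for large enough $n$,

$$P(W^t \notin B_n) \le e^{-M^2/8}.$$

Then we can write, if $q \ge 1 + d/2$,

$$P(W^T \notin B_n) = \int_0^1 P(W^t \notin B_n)\, g(t)\,dt \le P(T < r) + \int_r^1 P(W^t \notin B_n)\, g(t)\,dt$$
$$\le C r^{-c} e^{-C' r^{-d/2}\log^q(1/r)} + e^{-M^2/8} \le e^{-Cn\varepsilon_n^2}.$$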



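Entropy. It is enough to bound from above

$$\log N(2\varepsilon_n, M\mathbb{H}^1_r + \varepsilon_n\mathbb{B}_1, \|\cdot\|_{\mathbb{B}}) \le \log N(\varepsilon_n, M\mathbb{H}^1_r, \|\cdot\|_{\mathbb{B}}) \le C r^{-d/2}\Big(\log\frac{M}{\varepsilon_n}\Big)^{1+d/2} \le Cn\varepsilon_n^2,$$

where we have used (26) to obtain the one but last inequality.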
6 Posterior rate, main results



In the next paragraphs, we recall the definition of the Bayes posterior measure, in a dominated setting 
where the posterior is given by Bayes' formula, and state a general rate-Theorem. We then prove the 
announced results in the three considered statistical settings. We study the convergence of the posterior 
measure in a frequentist sense in that we suppose that there exists a 'true' parameter, here an unknown 
function, denoted $f_0$. That is, we consider convergence under the law of the data under $f_0$, and denote
the corresponding distribution $P^{(n)}_{f_0}$. The expectation under this distribution is denoted $E_{f_0}$.

For any densities $p, q$ with respect to a measure $\mu$ denote

$$K(p,q) = \int p \log\frac{p}{q}\,d\mu, \qquad V(p,q) = \int p \log^2\frac{p}{q}\,d\mu.$$



6.1 Bayesian framework and general result 

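Let $\mathcal{F}$ be a metric space equipped with a $\sigma$-field $\mathcal{T}$.

Data. Consider a sequence of statistical experiments $(\mathcal{X}_n, \mathcal{A}_n, \{P_f^{(n)}\}_{f\in\mathcal{F}})$ indexed by the space $\mathcal{F}$.
We assume that there exists a common ($\sigma$-finite) dominating measure $\mu^{(n)}$ to all probability measures
$\{P_f^{(n)}\}_{f\in\mathcal{F}}$, that is

$$dP_f^{(n)}(x^{(n)}) = p_f^{(n)}(x^{(n)})\,d\mu^{(n)}(x^{(n)}).$$

We also assume that the map $(x^{(n)}, f) \to p_f^{(n)}(x^{(n)})$ is jointly measurable relative to $\mathcal{A}_n \otimes \mathcal{T}$.

Prior. We equip the space $(\mathcal{F}, \mathcal{T})$ with a probability measure $\Pi$ that is called prior. Then the space $\mathcal{X}_n \times \mathcal{F}$ can be
naturally equipped with the $\sigma$-field $\mathcal{A}_n \otimes \mathcal{T}$ and with the probability measure

$$P(A_n \times T) = \int\!\!\int_{A_n \times T} p_f^{(n)}(x^{(n)})\,d\mu^{(n)}(x^{(n)})\,d\Pi(f).$$
The marginal in $f$ of this measure is the prior $\Pi$. The law of $X^{(n)} \mid f$ is $P_f^{(n)}$.

Bayes formula. Under the preceding framework, the conditional distribution of $f$ given the data is
absolutely continuous with respect to $\Pi$ and is given by, for any measurable set $T \in \mathcal{T}$,

$$\Pi(T \mid X^{(n)}) = \frac{\int_T p_f^{(n)}(X^{(n)})\,d\Pi(f)}{\int p_f^{(n)}(X^{(n)})\,d\Pi(f)}.$$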

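Let $\varepsilon_n \to 0$ be a rate of convergence such that $n\varepsilon_n^2 \to +\infty$. Define a Kullback-Leibler neighborhood
of the element $f_0$ in $\mathcal{F}$ by

$$B_{KL}(f_0, \varepsilon_n) = \big\{ f :\ K(p^{(n)}_{f_0}, p^{(n)}_f) \le n\varepsilon_n^2,\ \ V(p^{(n)}_{f_0}, p^{(n)}_f) \le n\varepsilon_n^2 \big\}.$$

Next we state a general result which gives sufficient conditions for the convergence of the posterior
measure. It is a slight variation on Theorem 1 in [14]. A first key ingredient is the existence of a distance
$d_n$ enabling testing on the set of objects $f$ of interest. Suppose that for some (semi-)distance $d_n$ on $\mathcal{F}$
there exist $K > 0$, $\xi > 0$, such that for any $\varepsilon > 0$ and any $f_1 \in \mathcal{F}$ with $d_n(f_0, f_1) > \varepsilon$, there exist tests $\varphi_n$ with

$$P^{(n)}_{f_0}\varphi_n \le e^{-Kn\varepsilon^2} \qquad (36)$$
$$\sup_{f:\ d_n(f,f_1) < \xi\varepsilon} P^{(n)}_f(1-\varphi_n) \le e^{-Kn\varepsilon^2}. \qquad (37)$$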
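Theorem 2 (Thm. 1 in [14]) Suppose there exist tests $\varphi_n$ as in (36)-(37), measurable sets $\mathcal{F}_n$ and
$C, d > 0$, such that, for some $\varepsilon_n \to 0$, $\bar\varepsilon_n \to 0$ and $n\varepsilon_n^2 \to +\infty$, $n\bar\varepsilon_n^2 \to +\infty$,

(N) $\log N(\bar\varepsilon_n, \mathcal{F}_n, d_n) \le d\,n\bar\varepsilon_n^2$
(S) $\Pi(\mathcal{F}\setminus\mathcal{F}_n) \le e^{-(C+4)n\varepsilon_n^2}$
(P) $\Pi(B_{KL}(f_0, \varepsilon_n)) \ge e^{-Cn\varepsilon_n^2}$

Set $\varepsilon^*_n = \varepsilon_n \vee \bar\varepsilon_n$. Then for large enough $M > 0$, as $n \to +\infty$,

$$E_{f_0}\Pi\big(f:\ d_n(f, f_0) \le M\varepsilon^*_n \mid X^{(n)}\big) \to 1.$$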

6.2 Applications 

Let us recall that we assume that the compact metric space $\mathcal{M}$ satisfies the conditions of Section 2
together with the polynomial-type growth (18) of volume of balls.

Application, Gaussian white noise. One observes

$$dX^{(n)}(x) = f(x)dx + \frac{1}{\sqrt{n}}dZ(x), \quad x \in \mathcal{M}. \qquad (38)$$

In this case we set $(\mathbb{B}, \|\cdot\|) = (L^2, \|\cdot\|_2)$. The prior $W^T$, see (23), here serves directly as a prior on $f$
(so $w = f$ here).

Here the testing distance is simply $d_n = \|\cdot\|_2$, the $L^2$-norm. Consider the test

$$\varphi_n = 1\Big\{ 2\int_{\mathcal{M}} \{f_1(x) - f_0(x)\}\,dX^{(n)}(x) > \|f_1\|_2^2 - \|f_0\|_2^2 \Big\}.$$

Then (36)-(37) follow from simple computations. Also, one can check using Girsanov's formula that for
model (38), the neighborhood $B_{KL}(f_0, \varepsilon_n)$ coincides with an $L^2$-ball of the same radius. Recall that the
definition of Besov spaces is given in Appendix A.

Theorem 3 (Gaussian white noise on $(\mathcal{M},\rho)$) Let us suppose that $f_0$ is in $B^s_{2,\infty}(\mathcal{M})$ with $s > 0$
and that the prior on $f$ is $W^T$ given by (23). Let $q = 1 + d/2$ in (21). Set $\varepsilon_n \sim \bar\varepsilon_n \sim (\log n/n)^{s/(2s+d)}$.
Then Equations (28), (29) and (30) are satisfied with the choice $(\mathbb{B}, \|\cdot\|_{\mathbb{B}}) = (L^2, \|\cdot\|_2)$. For $M$ large
enough, as $n \to +\infty$,

$$E_{f_0}\Pi\big(\|f - f_0\|_2 \ge M\varepsilon_n \mid X^{(n)}\big) \to 0.$$

Application, Fixed design regression. The observations are

$$Y_i = f(x_i) + \varepsilon_i, \quad 1 \le i \le n. \qquad (39)$$

The design points $\{x_i\}$ are fixed on $\mathcal{M}$ and the variables $\{\varepsilon_i\}$ are assumed to be i.i.d. standard normal.
The prior $W^T$, see (23), here serves directly as a prior on $f$ (so $w = f$ here).
Let us introduce the following semi-distance $d_n$. For $f_1, f_2$ in $\mathcal{F}$ let us set

$$d_n(f_1, f_2)^2 = \int (f_1 - f_2)^2\,d\mathbb{P}^x_n = \frac{1}{n}\sum_{i=1}^n (f_1 - f_2)^2(x_i). \qquad (40)$$

Let $\varphi_n$ be the likelihood ratio-type test defined by

$$\varphi_n = 1\Big\{ \sum_{i=1}^n (f_1 - f_0)(x_i)\,Y_i > \frac12\sum_{i=1}^n (f_1^2 - f_0^2)(x_i) \Big\}.$$

This test satisfies,

$$\sup_{f\in\mathcal{F},\ d_n(f,f_1) < d_n(f_0,f_1)/4} P^{(n)}_f(1 - \varphi_n) \le G\Big(-\frac{\sqrt{n}}{4}\,d_n(f_0, f_1)\Big).$$
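The error bound for this test can be verified exactly, since under $f$ the test statistic is Gaussian. The sketch below is an illustration under stated assumptions: $G$ is read as the standard normal distribution function, functions are represented by their values at the design points, and the names `d_n` and `type_two_error` are hypothetical helpers.

```python
import math
from statistics import NormalDist

G = NormalDist().cdf  # standard normal distribution function


def d_n(f, g):
    """Empirical L2 semi-distance (40) between two vectors of design-point values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))


def type_two_error(f0, f1, f):
    """Exact P_f(phi_n = 0) when Y_i = f(x_i) + N(0, 1) noise.

    The statistic sum (f1 - f0)(x_i) Y_i is Gaussian with the mean and standard
    deviation computed below; phi_n = 0 when it does not exceed the threshold.
    """
    diff = [b - a for a, b in zip(f0, f1)]
    mean = sum(c * v for c, v in zip(diff, f))
    sd = math.sqrt(sum(c * c for c in diff))
    threshold = 0.5 * sum(b * b - a * a for a, b in zip(f0, f1))
    return G((threshold - mean) / sd)


if __name__ == "__main__":
    n = 50
    xs = [i / n for i in range(1, n + 1)]
    f0 = [math.sin(2 * math.pi * x) for x in xs]
    f1 = [v + 0.5 for v in f0]
    f = [v + 0.1 * math.cos(2 * math.pi * x) for v, x in zip(f1, xs)]  # close to f1
    print(type_two_error(f0, f1, f), G(-math.sqrt(n) * d_n(f0, f1) / 4))
```

The example perturbs $f_1$ slightly so that $d_n(f, f_1) < d_n(f_0, f_1)/4$, which is the neighbourhood over which the supremum is taken.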

Also, simple calculations show that for model (39), the neighborhood $B_{KL}(f_0, \varepsilon_n)$ coincides with an
$L^2(\mathbb{P}^x_n)$-ball of the same radius, which itself contains a sup-norm ball of that radius.

Theorem 4 (Fixed design regression on $(\mathcal{M},\rho)$) Let us suppose that $f_0$ is in $B^s_{\infty,\infty}(\mathcal{M})$, with $s > 0$,
and that the prior on $f$ is $W^T$ given by (23). Let $q = 1 + d/2$ in (21). Set $\varepsilon_n \sim \bar\varepsilon_n \sim (\log n/n)^{s/(2s+d)}$.
Then Equations (28), (29) and (30) are satisfied with $(\mathbb{B}, \|\cdot\|_{\mathbb{B}}) = (C^0(\mathcal{M}), \|\cdot\|_\infty)$. For $M$ large enough,
as $n \to +\infty$,

$$E_{f_0}\Pi\big(d_n(f, f_0) \ge M\varepsilon_n \mid X^{(n)}\big) \to 0.$$

Application, Density estimation. The observations are a sample

$$(X_i)_{1\le i\le n}\ \text{i.i.d.} \sim f, \qquad (41)$$

for a density $f$ on $\mathcal{M}$. The true density $f_0$ is assumed to be continuous and bounded away from $0$ and
infinity on $\mathcal{M}$. In order to build a prior on densities, we consider the transformation, for any given
continuous function $w : \mathcal{M} \to \mathbb{R}$,

$$f^A_w(x) := \frac{A(w(x))}{\int_{\mathcal{M}} A(w(u))\,d\mu(u)}, \quad x \in \mathcal{M},$$

where $A : \mathbb{R} \to (0, +\infty)$ is such that $\log A$ is Lipschitz on $\mathbb{R}$ and has an inverse $A^{-1} : (0, +\infty) \to \mathbb{R}$.
For instance, one can take the exponential function as $A$. Here, the function $w_0$ is taken to be $w_0 :=
A^{-1} f_0$. The prior $W^T$, see (23), here serves as a prior on $w$'s, which induces a prior on densities via
the transformation $f^A_w$. That is, the final prior $\Pi$ on densities we consider is the law of $f^A_{W^T}$. In this case we set
$(\mathbb{B}, \|\cdot\|) = (C^0(\mathcal{M}), \|\cdot\|_\infty)$, the Banach space in which the function $w$ and the prior live.
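The transformation $w \mapsto f^A_w$ can be illustrated on a discretized space. The sketch below makes several assumptions that are not in the paper: the space is replaced by a uniform grid on the circle of length 1 with equal quadrature weights, $A$ is taken to be the exponential (so $\log A$ is 1-Lipschitz), and `density_from_w` is a hypothetical helper name.

```python
import math


def density_from_w(w_vals, weights, A=math.exp):
    """Return the grid values of A(w) / integral(A(w)) and the normalising integral."""
    a = [A(v) for v in w_vals]
    Z = sum(ai * wi for ai, wi in zip(a, weights))
    return [ai / Z for ai in a], Z


if __name__ == "__main__":
    m = 200
    xs = [j / m for j in range(m)]
    weights = [1.0 / m] * m                                  # uniform quadrature weights
    f0 = [1.0 + 0.5 * math.sin(2 * math.pi * x) for x in xs]  # a positive density
    w0 = [math.log(v) for v in f0]                           # w0 = A^{-1} f0 with A = exp
    f_back, Z = density_from_w(w0, weights)
    print(Z, max(abs(a - b) for a, b in zip(f_back, f0)))
```

Starting from $w_0 = A^{-1} f_0$ the transformation returns $f_0$ itself, which is the sense in which $w_0$ parametrizes the true density.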

Testing distance. It is known from [3] and [19] that for any two convex sets $\mathcal{P}_0$ and $\mathcal{P}_1$ of probability
measures, there exist tests $\varphi_n$ such that, with $h$ the Hellinger distance,

$$\sup_{P\in\mathcal{P}_0} P^n\varphi_n \le \exp\big(n\log\{1 - \tfrac12 h^2(\mathcal{P}_0, \mathcal{P}_1)\}\big) \qquad (42)$$
$$\sup_{P\in\mathcal{P}_1} P^n(1 - \varphi_n) \le \exp\big(n\log\{1 - \tfrac12 h^2(\mathcal{P}_0, \mathcal{P}_1)\}\big) \qquad (43)$$

where $h(\mathcal{P}_0, \mathcal{P}_1)$ is the infimum of $h(P_0, P_1)$ over $P_0 \in \mathcal{P}_0$ and $P_1 \in \mathcal{P}_1$. So, setting $d_n = h$, with the
help of the inequality $\log(1 - x) \le -x$, we get that (36)-(37) hold.
The following is a slight extension of [31], Lemma 3.1.

Lemma 1 For any measurable functions v,w, and a positive function A on K such that log A is a 
L-Lipschitz function on R, there exists a universal constant C such that 

K(tf t ft) < CL\\v - W ||ooe l|M|U/2 (l + 2L\\v - HU) 
V(f A , f A ) < CL\\v - ^H^ell— 'H-/ 2 (l + 2L\\v - HI-) 2 . 

With this lemma we see that properties (30)-(29)-(28) automatically translate up to multiplicative
constants into properties (N), (S), (P) where $d_n$ is Hellinger's distance.

Proof The Hellinger distance between f A and ft can be written 



h(f A f A ] = ll VMv) VMw) .I || A /lW-VlHlb 

The inequality e x < 1 + xe x for x > implies 



y/Mv) - V^H = V^M e ( lo S^^)-logvl( TO ))/2 _ 1 

< |V^Rll«-^llooe l|M|| - /2 . 
This leads to the first inequality. By Lemma 8 in [13], for any densities p, q 

K(p, q) < h{p, q) (l + log II^U) , V(p, q) < h(p, q) (l + log \\^\\, 
Since $\log A$ is $L$-Lipschitz, we have that

$$e^{-L\|v-w\|_\infty}\int A(w)\,d\mu \le \int A(v)\,d\mu \le e^{L\|v-w\|_\infty}\int A(w)\,d\mu.$$



Inserting this into the following chain of inequalities, 

log||^||oo<||log&|oo 
J w J u 



This proves the Lemma. □ 

Theorem 5 (Density estimation on $(\mathcal{M},\rho)$) Suppose the true density $f_0$ is a continuous function
bounded away from $0$ and $+\infty$ over $\mathcal{M}$. Let the prior $\Pi$ be the law on densities induced by $f^A_{W^T}$, with $W^T$
defined by (23). Let $q = 1 + d/2$ in (21) and $A$ be a positive invertible function with $\log A$ Lipschitz on
$\mathbb{R}$. Let us suppose that $A^{-1} f_0$ is in $B^s_{\infty,\infty}(\mathcal{M})$, $s > 0$. Set $\varepsilon_n \sim \bar\varepsilon_n \sim (\log n/n)^{s/(2s+d)}$. Then Equations
(28), (29) and (30) are satisfied with $(\mathbb{B}, \|\cdot\|_{\mathbb{B}}) = (C^0(\mathcal{M}), \|\cdot\|_\infty)$. For $M$ large enough, as $n \to +\infty$,

$$E_{f_0}\Pi\big(h(f, f_0) \ge M\varepsilon_n \mid X^{(n)}\big) \to 0.$$
Proofs. The proofs of Theorems 3 and 4 directly follow from the results in Section 5. Indeed, for the
white noise case, apply Theorem 2 with $\mathcal{F} = L^2$, $d_n = \|\cdot\|_2$, $\mathcal{F}_n$ the set $B_n$ defined in Section 5 and
$f_0 = w_0 \in B^s_{2,\infty}$. The regression case is similar with $\mathcal{F} = C^0(\mathcal{M})$ equipped with the sup-norm and
the semi-distance $d_n$ defined in (40).

In the density case, the key property is the following inclusion, for some $c > 0$ and any small enough
$\varepsilon > 0$ (recall that $w_0 = A^{-1} f_0$ so that $f_0 = f^A_{w_0}$)

$$\{f^A_w :\ \|w - w_0\|_\infty \le \varepsilon\} \subset \{f :\ d_n(f, f_0) \le c\varepsilon\} \cap B_{KL}(f_0, c\varepsilon). \qquad (44)$$

This means that the testing distance $d_n$ on densities and the KL-divergence properly relate to the Banach
space distance $\|\cdot\|_{\mathbb{B}}$ on the set of $w$'s. Now the inclusion property (44) is clear in view of the definition
of $B_{KL}$ and Lemma 1. Therefore, the inequalities (28)-(29)-(30) of Section 5 related to functions $w$
automatically translate into the properties (N), (S), (P) for functions $f$ needed for Theorem 2. The
remainder of the proof is as for the white noise and regression case. □

In the case that $\mathcal{M}$ is a compact connected orientable manifold without boundary, minimax rates of
convergence have been obtained in [11], where Sobolev balls of smoothness index $s$ are considered and
data are generated from a regression setting. In particular, in this framework, our procedure is adaptive
in the minimax sense for Besov regularities, up to a logarithmic factor.

We have obtained convergence rates for the posterior distribution associated to the geometrical prior 
in a variety of statistical frameworks. Obtaining these rates does not presuppose any a priori knowledge 
of the regularity of the function fo. Therefore our procedure is not only nearly minimax, but also nearly 
adaptive. 

Note also that another attractive property of the method is that it does not assume a priori any
(upper or lower) bound on the regularity index $s > 0$. This is related to the fact that approximation is
via the spaces $\mathbb{H}_t$, which are made of (super)-smooth functions.

7 Lower bound for the rate 

Works obtaining (nearly-)adaptive rates of convergence for posterior distributions are relatively recent 
and so far were obtained for density or regression on subsets of the real line or the Euclidean space. Often,
logarithmic factors are reported in the (upper-bound) rates, but it is unclear whether the rate must 
include such a logarithmic term. We aim at giving an answer to this question in our setting by providing 
a lower bound for the rate of convergence of our general procedure. This lower bound implies that the 
rates obtained in Section 6 are, in fact, sharp. One can conjecture that the same phenomenon appears 
for hierarchical Bayesian procedures with randomly rescaled Gaussian priors when the initial Gaussian 
prior has a RKHS which is made of super-smooth functions (e.g. infinitely differentiable functions), for 
instance the priors considered in [31], [25]. 

For simplicity we consider the Gaussian white noise model

$$dX^{(n)}(x) = f(x)dx + \frac{1}{\sqrt{n}}dZ(x), \quad x \in \mathcal{M}.$$
We set $(\mathbb{B}, \|\cdot\|) = (L^2(\mathcal{M}), \|\cdot\|_2)$. As before, for this model the prior sits on the same space as the
function $f$ to be estimated, so $w = f$.

Theorem 6 (Gaussian white noise on $(\mathcal{M},\rho)$, lower bound) Let $\varepsilon_n = (\log n/n)^{s/(2s+d)}$ for $s > 0$
and let the prior on $f$ be the law induced by $W^T$, see (23), with $q > 0$ in (21). Then there exist $f_0$ in the
unit ball of $B^s_{2,\infty}(\mathcal{M})$ and a constant $c > 0$ such that

(11/ - fob < cQogn) ^- 1 -*^ | XW) 0. 

As a consequence, for any prior of the type (23) with any $q > 0$ in (21), the posterior convergence
rate cannot be faster than $\varepsilon_n$ above. If $q$ is larger than $1 + d/2$, the rate becomes even slower than $\varepsilon_n$.

Remark 4 More generally, an adaptation of the proof of Theorem 6 yields that, for any 'reasonable'
prior on $T$, in that, for $\varepsilon_n \sim (\log n/n)^{s/(2s+d)}$, it holds

$$\Pi[\|f - f_0\|_2 \le \varepsilon_n] \ge e^{-Cn\varepsilon_n^2},$$

then $\Pi[\|f - f_0\|_2 \le c\varepsilon_n \mid X] \to 0$ for small enough $c > 0$. This condition is the standard 'prior mass'
condition in checking upper-bound rates, see (28). Note that the previous display is automatically implied
if the prior satisfies $\Pi[\|f - f_0\|_2 \le \varepsilon^*_n] \ge e^{-Cn(\varepsilon^*_n)^2}$ for $\varepsilon^*_n = n^{-s/(2s+d)}$, or more generally for any rate at
least as fast as $\varepsilon_n$. For instance, this can be used to check that taking a uniform prior on $(0, 1)$ as law
for $T$ leads to the same lower bound rate.

Proof We use a general approach to prove lower bounds for posterior measures introduced in [6] (see
[6], [7] for examples). The idea is to apply the following lemma (Lemma 1 in [14]) to the sets $\{f \in
\mathbb{B},\ \|f - f_0\|_{\mathbb{B}} \le \zeta_n\}$, for some rate $\zeta_n \to 0$ and $f_0$ in $B^s_{2,\infty}$, with $s > 0$.

Lemma 2 If $\alpha_n \to 0$ and $n\alpha_n^2 \to +\infty$ and if $B_n$ is a measurable set such that

$$\Pi(B_n)/\Pi(B_{KL}(f_0, \alpha_n)) \le e^{-2n\alpha_n^2},$$
then $E_{f_0}\Pi(B_n \mid X^{(n)}) \to 0$ as $n \to +\infty$.

In our context this specializes as follows. Let $\alpha_n \to 0$ and $n\alpha_n^2 \to +\infty$. Suppose that, as $n \to +\infty$,

$$\frac{\Pi(\|f - f_0\|_2 \le \zeta_n)}{\Pi(\|f - f_0\|_2 \le \alpha_n)} = o\big(e^{-2n\alpha_n^2}\big).$$
Then $\zeta_n$ is a lower bound for the rate of the posterior in that, as $n \to +\infty$,

$$E_{f_0}\Pi\big(\|f - f_0\|_2 \le \zeta_n \mid X^{(n)}\big) \to 0.$$

We first deal with the case where $q \le 1 + d/2$. In this case let us choose $\alpha_n = 2\varepsilon_n$, where $\varepsilon_n =
(\log n/n)^{s/(2s+d)}$. In Section 5, we have established in (31) that, for the prior $W^T$ with $q \le 1 + d/2$ in
(21), there exists $C > 0$ with

$$\Pi(\|f - f_0\|_2 \le \varepsilon_n) = P(\|W^T - w_0\|_{\mathbb{B}} \le \varepsilon_n) \ge e^{-Cn\varepsilon_n^2}.$$
So it is enough to show that, for some well-chosen $\zeta_n \to 0$,

$$\Pi(\|f - f_0\|_2 \le \zeta_n) = o\big(e^{-(8+C)n\varepsilon_n^2}\big). \qquad (45)$$

We would like to take $\zeta_n = \kappa\varepsilon_n$, for some (small) constant $\kappa > 0$. In order to bound from above the
previous probability, we write

$$\Pi[\|f - f_0\|_2 \le \zeta_n] = \int_0^1 \Pi[\|W^t - f_0\|_2 \le \zeta_n]\,g(t)\,dt \le \int_0^1 \exp\big[-\varphi^t_{f_0}(\zeta_n)\big]\,g(t)\,dt.$$

We separate the above integral in two parts. The first one is $\mathcal{T}_1 := \{0 < t \le Bt^*_n\}$, where $t^*_n$ is a similar
cut-off as in the upper-bound proof, $t^*_n = \zeta_n^{2/s}\log(1/\varepsilon_n)$. On $\mathcal{T}_1$, one can bound from below $\varphi^t_{f_0}(\zeta_n)$ by
its small ball probability part $\varphi^t_0(\zeta_n)$. Moreover, thanks to relation (3.1) in [18], we have, for any $\lambda > 0$
and $t \in (0, 1]$,

$$\varphi^t_0(\zeta_n) = -\log P[\|W^t\|_2 < \zeta_n] \ge H(2\zeta_n, \lambda\mathbb{H}^1_t, \|\cdot\|_2) - \frac{\lambda^2}{2}.$$
Set $\lambda = 1$ and recall from Remark 3 that the lower bound on the entropy can be used for any $t$ regardless
of the value of $\varepsilon$. This yields, for large enough $n$, if $\zeta_n = o(1)$,

$$\varphi^t_0(\zeta_n) \ge C(Bt^*_n)^{-d/2}\log^{1+d/2}(1/\zeta_n) - \frac12 \ge CB^{-d/2}\zeta_n^{-d/s}\log(1/\zeta_n).$$

Thus we obtain

$$\int_0^{Bt^*_n} \exp\big[-\varphi^t_{f_0}(\zeta_n)\big]\,g(t)\,dt \le e^{-CB^{-d/2}\zeta_n^{-d/s}\log(1/\zeta_n)}\int_0^{Bt^*_n} g(t)\,dt \le e^{-CB^{-d/2}\zeta_n^{-d/s}\log(1/\zeta_n)}.$$

This is less than $e^{-(8+C)n\varepsilon_n^2}$ provided $\zeta_n = \kappa\varepsilon_n$ and $\kappa > 0$ is small enough.

It remains to bound the integral from above on $\mathcal{T}_2 := \{Bt^*_n < t \le 1\}$. Here we bound $\varphi^t_{f_0}(\zeta_n)$ from
below by its approximation part. For any $t \in \mathcal{T}_2$,

$$\varphi^t_{f_0}(\zeta_n) \ge \frac12 \inf_{h\in\mathbb{H}_t:\ \|h - f_0\|_2 < \zeta_n} \|h\|^2_{\mathbb{H}_t}.$$

We prove in Appendix B, see Theorem 7, that there exist constants $c$, $C$ and $f_0$ in $B^s_{2,\infty}(\mathcal{M})$ such that

$$\varphi^t_{f_0}(\zeta_n) \ge C\zeta_n^2\, e^{c\,t\,\zeta_n^{-2/s}}. \qquad (46)$$

Now, under (46) for the previous fixed function $f_0$, taking $\zeta_n = \kappa\varepsilon_n$ for small (but fixed) enough $\kappa$,
it holds, when $t$ belongs to $\mathcal{T}_2$,

^ (Cn) >C( K e n ) a e CK ~ 2/SBl °^^. 

For $\kappa$ small enough, this is larger than any given power of $\varepsilon_n^{-1}$. In particular, it is larger than $(8+C)n\varepsilon_n^2$ if the
(upper-bound) rate $\varepsilon_n$ is no more than polynomial in $n$, which is the case here since $\varepsilon_n = (\log n/n)^{s/(2s+d)}$.
We have verified that (45) is satisfied, which gives the desired lower bound result when $q \le 1 + d/2$ using
Lemma 2.

In the case that $q > 1 + d/2$, the proof is similar, except that the exponent of the logarithmic
factor in (31) has now the power q — d/2, due to the assumption on the prior density g, and that e n is 
now replaced by e n = (logn) 9_1_ ^e„. 
8 Appendix A: Besov spaces

We follow the paper [9] to introduce the Besov spaces in this setting with $s > 0$, $1 \le p \le \infty$ and
$0 < q \le \infty$. To do so, let us introduce a (Littlewood-Paley) function $\Phi \in C^\infty(\mathbb{R}_+)$ such that

supp#C [0,2], <P>0fori/>l, |<P(A)| = 1 for AG [0,1], (47) 

Set $\Phi_j(\lambda) := \Phi(2^{-j}\lambda)$ for $j \ge 1$.

Definition 1 Let $s > 0$, $1 \le p \le \infty$, and $0 < q \le \infty$. The Besov space $B^s_{pq} = B^s_{pq}(L)$ is defined as the
set of all $f \in L^p(\mathcal{M}, \mu)$ such that

$$\|f\|_{B^s_{pq}} := \Big(\sum_{j\ge 0}\big(2^{js}\|\Phi_j(\sqrt{L})f(\cdot) - f(\cdot)\|_{L^p}\big)^q\Big)^{1/q} < \infty. \qquad (48)$$

Here the $\ell^q$-norm is replaced by the sup-norm if $q = \infty$.

Remark 5 This definition is independent of the choice of $\Phi$. Actually, if $E_t(f)_p$ denotes the best approximation
of $f \in L^p$ from $\Sigma_t$, that is,

$$E_t(f)_p := \inf_{g\in\Sigma_t}\|f - g\|_p$$

(here $L^\infty$ is identified as the space UCB of all uniformly continuous and bounded functions on $\mathcal{M}$) then

$$B^s_{pq} := \Big\{f \in L^p :\ \|f\|_{A^s_{pq}} := \|f\|_p + \Big(\sum_{j\ge 0}\big(2^{sj}E_{2^j}(f)_p\big)^q\Big)^{1/q} < \infty\Big\}.$$

9 Appendix B: Entropy properties 

9.1 Covering number, entropy, e— net. 

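Let $(X,\rho)$ be a metric space. For $\varepsilon > 0$ the covering number $N(\varepsilon, X)$ is the smallest number of balls of
radius $\varepsilon$ covering $X$. The entropy $H(\varepsilon, X)$ is by definition $H(\varepsilon, X) = \log_2 N(\varepsilon, X)$.
An important result of this paper is the link between the covering number $N(\varepsilon, \mathcal{M}, \rho)$ of the
space $\mathcal{M}$, and $H(\varepsilon, B, L^p)$, the entropy number of the unit ball $B$ of some functional space, computed in
the $L^p$ metric.

An $\varepsilon$-net $A \subset X$ is a set such that $\xi \ne \xi' \in A$ implies $\rho(\xi, \xi') > \varepsilon$. A maximal $\varepsilon$-net $A_\varepsilon$ is an
$\varepsilon$-net such that for all $x \in X \setminus A_\varepsilon$, $A_\varepsilon \cup \{x\}$ is no longer an $\varepsilon$-net. So, for a maximal $\varepsilon$-net $A_\varepsilon$ we have:

$$X \subset \bigcup_{\xi\in A_\varepsilon} B(\xi, \varepsilon), \qquad \xi \ne \xi' \in A_\varepsilon \Rightarrow B(\xi, \varepsilon/2) \cap B(\xi', \varepsilon/2) = \emptyset.$$

Hence, for $A_\varepsilon$ a maximal $\varepsilon$-net we have:

$$N(\varepsilon/2, X) \ge \mathrm{card}(A_\varepsilon) \ge N(\varepsilon, X).$$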
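A maximal $\varepsilon$-net can be built greedily on a finite point set. The sketch below is an illustration only: it assumes the space is the circle of total length 1 with its geodesic distance, discretized on a regular grid, and the names `circle_dist` and `greedy_maximal_net` are hypothetical.

```python
def circle_dist(x, y):
    """Geodesic distance on the circle of total length 1."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)


def greedy_maximal_net(points, eps, dist=circle_dist):
    """Scan the points once, keeping a point iff it is at distance > eps from all kept points.

    The result is eps-separated, and maximal within `points`: every discarded
    point lies within eps of some kept point, so adding it would break separation.
    """
    net = []
    for p in points:
        if all(dist(p, q) > eps for q in net):
            net.append(p)
    return net


if __name__ == "__main__":
    grid = [j / 1000 for j in range(1000)]
    net = greedy_maximal_net(grid, eps=0.0503)
    print(len(net))
```

On this example the balls of radius $\varepsilon/2$ centred at the net points are disjoint arcs of length $\varepsilon$, so the cardinality is at most $1/\varepsilon$, while the covering property forces it to be at least of order $1/(2\varepsilon)$.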






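Now if $(X, \rho)$ is a doubling metric space then we have the following property: if $x_1, \dots, x_N \in B(x, r)$
are such that $\rho(x_i, x_j) > r2^{-l}$ ($l \in \mathbb{N}$), clearly $B(x, r) \subset B(x_i, 2r) = B(x_i, 2^{l+2}(r2^{-l-1}))$ and the balls
$B(x_i, r2^{-l-1})$ are disjoint and contained in $B(x, 2r)$, so:

$$N\,2^{-(l+2)D}|B(x, r)| \le \sum_{i=1}^N |B(x_i, r2^{-l-1})| \le |B(x, 2r)| \le 2^D|B(x, r)|. \qquad (49)$$

If $A_{r2^{-l}}$ is any $r2^{-l}$-net then: $\mathrm{Card}(A_{r2^{-l}}) \le 2^{(l+3)D}N(X, r)$. So if $A_\varepsilon$ is any maximal $\varepsilon$-net and for
$l \in \mathbb{N}$, $A_{2^l\varepsilon}$ is any maximal $2^l\varepsilon$-net then:

$$N(X, \varepsilon 2^l) \le N(X, \varepsilon) \le \mathrm{Card}(A_\varepsilon) \le 2^{(l+3)D}N(X, 2^l\varepsilon) \le 2^{(l+3)D}\,\mathrm{Card}(A_{2^l\varepsilon}). \qquad (50)$$

For $l = 0$:

$$2^{-3D}\,\mathrm{Card}(A_\varepsilon) \le N(X, \varepsilon) \le \mathrm{Card}(A_\varepsilon).$$

So for any $\varepsilon > 0$, and for any maximal $\varepsilon$-net $A_\varepsilon$, $\mathrm{Card}(A_\varepsilon)$ and $N(X, \varepsilon)$ are of the same order.
Moreover clearly, taking $r = 1$ in (49), so that $B(x, 1) = \mathcal{M}$, we get:

$$N(\delta, \mathcal{M}) \le 4^D\Big(\frac{1}{\delta}\Big)^D. \qquad (51)$$

Remark 6 In a number of examples (for instance for compact Riemannian manifolds) there exist absolute
constants $c_1 > 0$, $c_2 > 0$, $d > 0$ such that

$$\text{for all } x \in \mathcal{M},\ \text{for all } 0 < r \le 1, \quad c_1 r^d \le |B(x, r)| \le c_2 r^d.$$

Necessarily, $d \le D$ since using (3) $|B(x, r)| \ge (r/2)^D$; so $c_2 r^d \ge (r/2)^D$, $0 < r \le 1$.
Let $(B(x_i, \varepsilon))_{i\in I}$ be a minimal covering of $\mathcal{M}$; we have

$$1 = |\mathcal{M}| \le \sum_{i\in I}|B(x_i, \varepsilon)| \le N(\varepsilon, \mathcal{M})\,c_2\varepsilon^d.$$

Now if $A_\varepsilon$ is any maximal $\varepsilon$-net, we have:

$$1 = |\mathcal{M}| \ge \sum_{\xi\in A_\varepsilon}|B(\xi, \varepsilon/2)| \ge \mathrm{card}(A_\varepsilon)\,c_1(\varepsilon/2)^d.$$

As a conclusion, in the case (18):

$$\frac{1}{c_2}\Big(\frac{1}{\varepsilon}\Big)^d \le N(\varepsilon, \mathcal{M}) \le \mathrm{card}(A_\varepsilon) \le \frac{2^d}{c_1}\Big(\frac{1}{\varepsilon}\Big)^d. \qquad (52)$$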
c 2 e c\ e 
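9.2 Dimension of spectral spaces, covering number, and trace of $P_t$.

Let us now use the 'heat kernel' assumptions. The following proposition gives the link between the
covering number $N(\delta, \mathcal{M})$ of the underlying space $\mathcal{M}$, the behavior of the trace of $e^{-tL}$ and the dimension
of the spectral spaces. Let us define:

$$\Sigma_\lambda = \bigoplus_{\sqrt{\lambda_k}\le\lambda}\mathcal{H}_{\lambda_k}.$$

Clearly the projector $P_{\Sigma_\lambda}$ is a kernel operator and

$$P_{\Sigma_\lambda}(x, y) = \sum_{\sqrt{\lambda_k}\le\lambda} P_k(x, y).$$

Then one can prove the following bounds (see [9], Lemma 3.19): for any $\lambda \ge 1$, and $\delta = \frac{1}{\lambda}$,

$$\exists\, C_2, C_2' \text{ such that } \frac{C_2'}{|B(x, \delta)|} \le P_{\Sigma_\lambda}(x, x) \le \frac{C_2}{|B(x, \delta)|}. \qquad (53)$$

Let us recall that $\mathrm{Tr}(e^{-tL}) = \sum_k e^{-\lambda_k t}\dim(\mathcal{H}_{\lambda_k})$. In addition we have $\int_{\mathcal{M}} P_t(x, x)\,d\mu(x) = \mathrm{Tr}(e^{-tL})$.
Moreover, as

$$P_t(x, x) = \int_{\mathcal{M}} P_{t/2}(x, u)P_{t/2}(u, x)\,d\mu(u) = \int_{\mathcal{M}} (P_{t/2}(x, u))^2\,d\mu(u)$$

we have:

$$\mathrm{Tr}(e^{-tL}) = \int_{\mathcal{M}} P_t(x, x)\,d\mu(x) = \int_{\mathcal{M}}\int_{\mathcal{M}} (P_{t/2}(x, u))^2\,d\mu(u)\,d\mu(x) = \|e^{-\frac{t}{2}L}\|^2_{HS},$$

where $\|\cdot\|_{HS}$ stands for the Hilbert-Schmidt norm.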
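Proposition 4 1. For $\lambda \ge 1$, $\delta = \frac{1}{\lambda}$,

$$C_2'\int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \delta)|} \le \dim(\Sigma_\lambda) = \int_{\mathcal{M}} P_{\Sigma_\lambda}(x, x)\,d\mu(x) \le C_2\int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \delta)|}. \qquad (54)$$

2.

$$2^{-2D}N(\delta, \mathcal{M}) \le 2^{-2D}\mathrm{card}(A_\delta) \le \int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \delta)|} \le 2^D\mathrm{card}(A_\delta) \le 2^{4D}N(\delta, \mathcal{M}) \qquad (55)$$

where $A_\delta$ is any maximal $\delta$-net.

3.

$$C_1'\int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \sqrt{t})|} \le \mathrm{Tr}(e^{-tL}) \le C_1\int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \sqrt{t})|}.$$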

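Proof of the Proposition: 1. is a consequence of (53). Let us prove 2.:
Let $A_\delta$ be any maximal $\delta$-net.

$$\sum_{\xi\in A_\delta}\int_{B(\xi, \delta/2)}\frac{d\mu(x)}{|B(x, \delta)|} \le \int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \delta)|} \le \sum_{\xi\in A_\delta}\int_{B(\xi, \delta)}\frac{d\mu(x)}{|B(x, \delta)|}.$$

But:

$$x \in B(\xi, \delta/2) \Rightarrow B(x, \delta) \subset B(\xi, 2\delta), \quad\text{so}\quad \frac{1}{|B(x, \delta)|} \ge \frac{1}{|B(\xi, 2\delta)|} \ge \frac{2^{-2D}}{|B(\xi, \delta/2)|},$$

and in the same way:

$$x \in B(\xi, \delta) \Rightarrow B(\xi, \delta) \subset B(x, 2\delta), \quad\text{so}\quad \frac{1}{|B(x, \delta)|} \le \frac{2^D}{|B(x, 2\delta)|} \le \frac{2^D}{|B(\xi, \delta)|}.$$

This implies

$$2^{-2D}\mathrm{card}(A_\delta) \le \int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \delta)|} \le 2^D\mathrm{card}(A_\delta).$$

3. is a consequence of (11).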

□ 

The former results can be summarized in the following corollary:

Corollary 1

$$\mathrm{Trace}(e^{-tL}) \sim \dim(\Sigma_\lambda) \sim N(\delta, \mathcal{M}); \qquad \delta = \sqrt{t} = \frac{1}{\lambda}.$$
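The equivalence in Corollary 1 can be seen concretely on the simplest example. The sketch below is only an illustration and assumes a setting that is not taken from the paper: the circle of length $2\pi$ with $L = -d^2/dx^2$, whose eigenvalues are $k^2$ (multiplicity 2 for $k \ge 1$, multiplicity 1 for $k = 0$). The helper names are hypothetical.

```python
import math


def heat_trace(t, kmax=2000):
    """Trace of exp(-t L) on the circle: 1 + 2 * sum_{k>=1} exp(-k^2 t), truncated at kmax."""
    return 1.0 + 2.0 * sum(math.exp(-k * k * t) for k in range(1, kmax + 1))


def spectral_dim(lam):
    """Number of eigenfunctions with sqrt(eigenvalue) = k <= lam."""
    return 2 * math.floor(lam) + 1


def covering_number(delta):
    """Minimal number of arcs of radius delta needed to cover a circle of length 2*pi."""
    return math.ceil(math.pi / delta)


if __name__ == "__main__":
    for t in (1e-2, 1e-3, 1e-4):
        delta = math.sqrt(t)
        # All three quantities are of order 1/delta.
        print(t, heat_trace(t) * delta, spectral_dim(1.0 / delta) * delta,
              covering_number(delta) * delta)
```

After multiplication by $\delta$, the three quantities stay between fixed constants as $t \to 0$, which is the content of the corollary in dimension one.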



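9.3 Connection between covering number of $\mathcal{M}$ and entropy of $\mathbb{H}^1_t$.

In this section we establish the link between the covering number $N(\varepsilon, \mathcal{M})$ of the space $\mathcal{M}$ and
$H(\varepsilon, \mathbb{H}^1_t, L^p)$ for $p = 2, \infty$ stated in Theorem 1, which we recall here.

Let us suppose that for some $\mu > 0$, $a > 0$, $\varepsilon^\mu \le at$. Then there exists $\varepsilon_0 > 0$, such that for all
$0 < \varepsilon \le \varepsilon_0$,

$$H(\varepsilon, \mathbb{H}^1_t, L^2) \sim H(\varepsilon, \mathbb{H}^1_t, L^\infty) \sim N(\delta(t, \varepsilon), \mathcal{M})\cdot\log\frac{1}{\varepsilon} \quad\text{where}\quad \frac{1}{\delta(t, \varepsilon)} := \sqrt{\frac{1}{t}\log\frac{1}{\varepsilon}}.$$

Notice, of course, that one can replace $N(\delta(t, \varepsilon), \mathcal{M})$ at any place by $\mathrm{card}(A_{\delta(t,\varepsilon)})$ where $A_{\delta(t,\varepsilon)}$ is a
maximal $\delta(t, \varepsilon)$-net. Also, since $\mu(\mathcal{M}) = 1$, we have

$$H(\varepsilon, \mathbb{H}^1_t, L^2) \le H(\varepsilon, \mathbb{H}^1_t, L^\infty).$$

So the proof will be done in two steps:

We prove the lower bound for $H(\varepsilon, \mathbb{H}^1_t, L^2)$ in the next subsection, using Carl's inequality.
We prove next the upper bound for $H(\varepsilon, \mathbb{H}^1_t, L^\infty)$.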
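To make the order of magnitude explicit, one can combine the theorem with a polynomial covering number. The sketch below assumes $N(\delta, \mathcal{M}) = c\,\delta^{-d}$, in the spirit of (52); the constant `c` and the function names are illustrative choices, not part of the paper.

```python
import math


def delta(t, eps):
    """delta(t, eps), defined through 1 / delta = sqrt(log(1/eps) / t)."""
    return math.sqrt(t / math.log(1.0 / eps))


def entropy_order(t, eps, d, c=1.0):
    """N(delta(t, eps)) * log(1/eps) under the assumption N(delta) = c * delta^(-d)."""
    return c * delta(t, eps) ** (-d) * math.log(1.0 / eps)


if __name__ == "__main__":
    # Same quantity written in closed form: t^(-d/2) * log(1/eps)^(1 + d/2).
    t, eps, d = 0.05, 1e-3, 2
    print(entropy_order(t, eps, d), t ** (-d / 2) * math.log(1 / eps) ** (1 + d / 2))
```

Under this assumption the entropy is of order $t^{-d/2}(\log(1/\varepsilon))^{1+d/2}$, which is the form in which the bound is used in Section 5.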

9.3.1 Proof of the theorem: Lower estimates for $H(\varepsilon, \mathbb{H}^1_t, L^2)$.

Let us recall some classical facts: see the following references [4], [5]. For any subset $X$ of a metric space,
we define, for any $k \in \mathbb{N}$,

$$e_k(X) = \inf\{\varepsilon \ge 0,\ \exists\, 2^k \text{ balls of radius } \varepsilon \text{ covering } X\}.$$

Clearly

$$\varepsilon < e_k(X) \Rightarrow H(\varepsilon, X) \ge k.$$

Now for the special case of a compact positive selfadjoint operator $T : \mathbb{H} \to \mathbb{H}$ we have the following Carl
(cf [ ]) inequality relating $e_k(T(B))$, where $B$ is the unit ball of $\mathbb{H}$, and the eigenvalues $\mu_1 \ge \mu_2 \ge \dots$
(possibly repeated with their multiplicity order) of $T$:

$$\text{for all } k \in \mathbb{N},\ n \in \mathbb{N}^*, \quad e_k(T(B)) \ge 2^{-\frac{k}{2n}}\prod_{i=1}^n \mu_i^{1/n}. \qquad (56)$$
i=i 
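In our case, let us take: $T = P_{t/2}$, $\mu_i = e^{-\frac{t}{2}\lambda_i}$, $T(B) = \mathbb{H}^1_t$. Let us fix

$$\lambda = \sqrt{\frac{1}{t}\log\frac{1}{\varepsilon}} = \frac{1}{\delta(t, \varepsilon)}, \qquad n = \dim(\Sigma_\lambda), \qquad k \sim n\,\frac{\log(1/\varepsilon)}{\log 2}.$$

Carl's inequality gives:

$$e_k(\mathbb{H}^1_t) \ge 2^{-\frac{k}{2n}}\prod_{i=1}^n e^{-\frac{t}{2n}\lambda_i} \ge 2^{-\frac{k}{2n}}e^{-\frac{t\lambda^2}{2}} = 2^{-\frac{k}{2n}}\varepsilon^{1/2},$$

which is of order $\varepsilon$ for the above choice of $k$. So

$$H(\varepsilon, \mathbb{H}^1_t, L^2) \ge k \sim n\,\frac{\log(1/\varepsilon)}{\log 2} \sim \dim(\Sigma_\lambda)\log\frac{1}{\varepsilon},$$

but by Corollary 1, $\dim(\Sigma_\lambda) \sim N(\delta, \mathcal{M})$, $\delta = \frac{1}{\lambda}$. So:

$$H(\varepsilon, \mathbb{H}^1_t, L^2) \gtrsim \log\frac{1}{\varepsilon}\,N(\delta, \mathcal{M}), \qquad \frac{1}{\delta} = \lambda = \sqrt{\frac{1}{t}\log\frac{1}{\varepsilon}}.$$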

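Proof of the theorem: Upper estimate for $H(\varepsilon, \mathbb{H}^1_t, L^\infty)$.

We recall the notations introduced in Section 2 (especially 2.4). Let us suppose: $\varepsilon^\mu \le at$, $\mu > 0$, $a > 0$.
First, we prove that for all $\varepsilon > 0$ small enough, there exists $\delta$ ($\sim \delta(t, \varepsilon) := \sqrt{t/\log(1/\varepsilon)}$) such that

$$\text{for all } f \in \mathbb{H}^1_t, \quad \|\Phi(\delta\sqrt{L})f - f\|_\infty \le \frac{\varepsilon}{2}.$$

In a second step we use (16) to expand $\Phi(\delta\sqrt{L})f$ on the $|B(\xi, \delta)|D^\delta_\xi$'s:

$$\Phi(\delta\sqrt{L})f(x) = \sum_{\xi\in A_\delta}\Phi(\delta\sqrt{L})f(\xi)\,|B(\xi, \delta)|\,D^\delta_\xi(x).$$

In a third step, we use a family of points of $\Sigma_{1/\delta}$ as centers of balls of radius $\varepsilon/2$ covering $\Phi(\delta\sqrt{L})(\mathbb{H}^1_t)$,
so that the balls of radius $\varepsilon$ centered in these points form an $\varepsilon$-covering in $L^\infty$ norm of $\mathbb{H}^1_t$.
The next lemma gives evaluations of $\|\Phi(\delta\sqrt{L})f - f\|_\infty$ and $\|\Phi(\delta\sqrt{L})f\|_\infty$.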

Lemma 3 for all /eij 
1. 

^L)f\\oo<jij- 4 



j>0 
Proof of the lemma:

First, / G Hi so / = E fe Ei«i4(-)e- Afct/2 , EkZi K\ 2 < 1- As $(8VL)(x,y) = Z k $(5^)Pk(x,y), 

$(sVL)f(x) = ($(sVl)( X , .),/(.)> = EE^v^)4ei(*)e~ Afct/2 , hence 



k I 



mVT)f{ x )\ < (££i4lY /2 (£ e - A ^ 2 (^)£( e L.(-) 2 ) 1/2 

k I k I 

k 



k k 

[$ 2 {6^L)(x,x) AP t (x,x)}^ 2 



< y/uwj ye; K i 



using (11), (14) and (3). 
In the same way : 

#(6VL)f(x) = {¥{6VL)(x, .),/(•)> =££<^V^)44(*)e- Afet/2 , hence 



< (E E I4i 2 ) 1/2 (E e ~ Afct ^ 2 (^) E( e i«)) 2 ) 1/2 

k l k I 

< e-^'/^E^^^ft^^)) 172 

fe 

= e -i^[if 2 (jVI)(x,a;)] 1/2 



< C(!f 2 )e"i^ 



< 2 D/2 C{$ 

< e~8? 



is* 



JD/2 

1 



So 



Put A = t^t; as: 



i>o i>o 



/ ^ e -^ 2 ^> 2 #i e -^log2 
j2j a; 



2J 

A2 2 ^ 1 [°° „ D /2„-4x 2Dx _ 1 ^ 4 AD/4 u D/i e -u Du 



^ ~ log 2 7! x log2 2M ; 7.4/4 



as 



So 



✓>oo 

for all a G R,X > 0, / t a ^ 1 eT t dt < 2e~ x X a ~ 1 , ifJf>2(a-l) 
Jx 

oo , 

V2^e" A223 ' < —e-^A- 1 , HA>8(D-2) 

h " log2 

E ||*(2-Wl)/|U < cr"l\\fl^{\)-\n 

j>0 
First step: Fix $\delta$ such that $\|f - \Phi(\delta\sqrt{L})f\|_\infty \le \frac{\varepsilon}{2}$.
Using the previous lemma, we need to choose S so that 



\>Or-»'^y»'e-*^)-\ | = 5^. 

Let us take : 

At ,1 

then, as e M < at. 



if a is suitably chosen. So for | ~ J j log |, 

||/ -H5^)!\\oo < 



2 



Second step : e— covering of . 

Now if / G M.j, using lemma 3, \\'1>(8\/L)f\\ 00 < jsjj. Moreover &(6y/L)f e Sys, so, using (16) 

$(6VL)f(x)= *(SVL)f(Z)\B(Z,8)\Dl{x). 

Let us consider the following family : 

A*-) = C E he\B(Z,S)\Dl(x)), k 6 G N, |fcd < A' G N, ACe < ^ 

Certainly for all /elj, there exists (fcj) in the previous family such that 

\\${8VZ)f- J2 k s C^\B(t;,5)\Df(x))\\ 00 = \\ Y^i^^fiO-Ck^lB^S^DKx)]^ 

< sup |$(«vT)/(0 - cvi < ^ 

As — /||oo < § , one can cover by balls centered in the fr^\ of radius e. 

The cardinality of this family of balls is : (2K + iyard(A^)_ As 

7 is a structural constant, e M < at 

and 5 ~ 6(t,e), clearly 

HfeM,!. 00 ) <M(S(t,e),M).log- 

e 

9.4 Bounds for $E(\|W^t\|^2_{\mathbb{B}})$

In this section, we prove the following proposition with respect to the sup-norm. Similar bounds in the
$L^2$ norm are obtained along the way (even slightly more precise).

Proposition 5 There exist universal constants $C_1$ and $C_2$ such that

$$C_1 N(\sqrt{t}, \mathcal{M}) \le E(\|W^t\|^2_\infty) \le C_2 N(\sqrt{t}, \mathcal{M})\sup_{x\in\mathcal{M}}\frac{1}{|B(x, \sqrt{t})|}. \qquad (58)$$
x£M \B(x, vt)\ 
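We recall that $W^t$ writes

$$W^t(x) = \sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}X^l_k e^l_k(x),$$

where $X^l_k$ is a family of independent $N(0, 1)$ Gaussian variables. Clearly since $\mathcal{M}$ is supposed to have
measure 1,

$$E(\|W^t\|^2_2) \le E(\|W^t\|^2_\infty).$$

As $\|W^t\|^2_2 = \sum_k e^{-\lambda_k t}\sum_{1\le l\le\dim\mathcal{H}_k}(X^l_k)^2$, we get

$$E(\|W^t\|^2_2) = \sum_k e^{-\lambda_k t}\dim\mathcal{H}_k = \mathrm{Trace}(e^{-tL}) = \int_{\mathcal{M}} P_t(u, u)\,d\mu(u).$$

Hence using Proposition 4

$$C_1'2^{-2d}N(\sqrt{t}, \mathcal{M}) \le E(\|W^t\|^2_2) = \int_{\mathcal{M}} P_t(u, u)\,d\mu(u) \le C_1 2^{4d}N(\sqrt{t}, \mathcal{M}). \qquad (59)$$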
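The identity $E(\|W^t\|^2_2) = \mathrm{Trace}(e^{-tL})$ can be checked by simulation. The sketch below is an illustration under assumptions not taken from the paper: the circle with eigenvalues $k^2$ (multiplicity 2 for $k \ge 1$, multiplicity 1 for $k = 0$), a truncation of the series at `kmax`, and hypothetical helper names. The squared norm is computed from the coefficients through Parseval's identity, so no spatial grid is needed.

```python
import math
import random


def heat_trace(t, kmax):
    """Truncated trace of exp(-t L) on the circle."""
    return 1.0 + 2.0 * sum(math.exp(-k * k * t) for k in range(1, kmax + 1))


def squared_l2_norm_sample(t, kmax, rng):
    """One draw of ||W^t||_2^2 = sum_k exp(-lambda_k t) * sum_l (X_k^l)^2."""
    total = rng.gauss(0.0, 1.0) ** 2                      # k = 0, one eigenfunction
    for k in range(1, kmax + 1):
        weight = math.exp(-k * k * t)
        total += weight * (rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2)
    return total


def monte_carlo_mean(t, kmax=60, n_draws=2000, seed=0):
    """Average of independent draws of ||W^t||_2^2 with a fixed seed."""
    rng = random.Random(seed)
    return sum(squared_l2_norm_sample(t, kmax, rng) for _ in range(n_draws)) / n_draws


if __name__ == "__main__":
    t = 0.01
    print(monte_carlo_mean(t), heat_trace(t, 60))
```

The empirical mean and the trace should agree up to Monte Carlo error; the number of draws and the truncation level are arbitrary choices for the illustration.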

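Now, let us first observe, using again Proposition 4, that

$$E(\|W^t\|^2_\infty) = E\big(\sup_{x\in\mathcal{M}}|W^t(x)|^2\big) \ge \sup_{x\in\mathcal{M}} E(|W^t(x)|^2) = \sup_{x\in\mathcal{M}}\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t}(e^l_k(x))^2$$
$$= \sup_{x\in\mathcal{M}}\sum_k e^{-\lambda_k t}P_k(x, x) = \sup_{x\in\mathcal{M}} P_t(x, x) \sim \sup_{x\in\mathcal{M}}\frac{1}{|B(x, \sqrt{t})|}.$$

On the other side, using Cauchy-Schwarz inequality,

$$|W^t(x)|^2 = \Big|\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}X^l_k e^l_k(x)\Big|^2$$
$$\le \Big\{\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}(X^l_k)^2\Big\}\Big\{\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}(e^l_k(x))^2\Big\}$$
$$= \Big\{\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}(X^l_k)^2\Big\}\Big\{\sum_k e^{-\lambda_k t/2}P_k(x, x)\Big\} = \Big\{\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}(X^l_k)^2\Big\}P_{t/2}(x, x).$$

So

$$E(\|W^t\|^2_\infty) \le E\Big\{\sum_k\ \sum_{1\le l\le\dim\mathcal{H}_k} e^{-\lambda_k t/2}(X^l_k)^2\Big\}\cdot\sup_{x\in\mathcal{M}} P_{t/2}(x, x)$$
$$= \mathrm{Trace}(e^{-\frac{t}{2}L})\sup_{x\in\mathcal{M}} P_{t/2}(x, x) = \Big(\int_{\mathcal{M}} P_{t/2}(u, u)\,d\mu(u)\Big)\Big(\sup_{x\in\mathcal{M}} P_{t/2}(x, x)\Big).$$
Hence, we get

$$\sup_{x\in\mathcal{M}}\frac{1}{|B(x, \sqrt{t})|} \sim \sup_{x\in\mathcal{M}} P_t(x, x) \le E(\|W^t\|^2_\infty) \le \Big(\int_{\mathcal{M}} P_{t/2}(u, u)\,d\mu(u)\Big)\Big(\sup_{x\in\mathcal{M}} P_{t/2}(x, x)\Big) \lesssim N(\sqrt{t}, \mathcal{M})\sup_{x\in\mathcal{M}}\frac{1}{|B(x, \sqrt{t})|}.$$

And we have in addition: $N(\sqrt{t}, \mathcal{M}) \sim \int_{\mathcal{M}}\frac{d\mu(x)}{|B(x, \sqrt{t})|} \le \sup_{x\in\mathcal{M}}\frac{1}{|B(x, \sqrt{t})|}$.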
9.5 Lower bound for -Ay(e) 

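Theorem 7 For $s > 0$ fixed, there exists $f \in B^s_{2,\infty}(\mathcal{M})$ (unit ball of the Besov space) with $\|f\|^2_2 = 1$
and constants $c > 0$, $C > 0$ such that:

$$\text{for all } 1 \ge t > 0,\ \text{for all } 1 \ge \varepsilon > 0, \quad \inf_{\|f - h\|_2\le\varepsilon}\|h\|^2_{\mathbb{H}_t} \ge C\varepsilon^2 e^{c\,t\,\varepsilon^{-2/s}}.$$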

Let us take / such that 

||/|| 2 = l>e>0. 

We are interested in : 

,„ D inf i, H^lli- 

I/— -Ft/2fl , l|2=e 

Let us put 

<%) = ||/- Pt/29\\l = II/II2 - 2{P t/2 f,g) + (P t g,9) = e\ 9{cj) = \\g\\l (60) 

We have, 

D$(g) = -2P t/2 f + 2P t (g), £><%) = 2g 
So, inf V{g) = &(g ) =^ D&(g ) = -/i£><% ) 

<t>(g)=e 2 

with .00 = -fiPt(go) + ^Pt/ 2 f- 

Necessarily n 7^ 0, otherwise go = and <I>(go) = \\f\\ 2 ^> e 2 - Let us put A = i. We necessarily have 
A.90 = ft/2/ ~ ^(ffo), n ence (A + P t ){g ) = P t / 2 f, so 

. 9o = (A + P t )- 1 P t/2 /. 

Let us now write the constraint : 

e 2 = ||/ - Pt/ 2 <7lll = 11/ - ^/2( A + Pr'Pt^fWl = 11/ - (A + Pr'P/lla = II A(A + Pt)" 1 /!!!- 
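Clearly:

\[
\lambda \mapsto \|\lambda(\lambda + P_t)^{-1}f\|_2^2
\]

is increasing from $0$ to $\|f\|_2^2$. As well,

\[
\lambda \mapsto \|(\lambda + P_t)^{-1}P_{t/2}f\|_2^2
\]

is decreasing. On the other hand, if $L = \int x\,dE_x$, then

\[
\|\lambda(\lambda + P_t)^{-1}f\|_2^2 = \int_0^\infty \Big(\frac{\lambda}{\lambda + e^{-tx}}\Big)^2 d\langle E_x f, f\rangle = \varepsilon^2
\]

and

\[
\|g_0\|_2^2 = \|(\lambda + P_t)^{-1}P_{t/2}f\|_2^2 = \int_0^\infty \Big(\frac{1}{\lambda + e^{-tx}}\Big)^2 e^{-tx}\,d\langle E_x f, f\rangle.
\]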
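Since the constraint is increasing in $\lambda$, the multiplier can be located by bisection once the spectral measure is known. The following sketch assumes a purely discrete spectral measure with hypothetical atoms `x[k]` and weights `f[k]**2`; the data, the function names and the number of iterations are illustrative assumptions, not part of the paper.

```python
import math


def constraint(lam, f, x, t):
    """sum_k (lam / (lam + exp(-t x_k)))^2 f_k^2, increasing in lam."""
    return sum((lam / (lam + math.exp(-t * xk))) ** 2 * fk * fk
               for fk, xk in zip(f, x))


def squared_norm_g0(lam, f, x, t):
    """sum_k exp(-t x_k) f_k^2 / (lam + exp(-t x_k))^2, decreasing in lam."""
    return sum(math.exp(-t * xk) * fk * fk / (lam + math.exp(-t * xk)) ** 2
               for fk, xk in zip(f, x))


def solve_multiplier(eps, f, x, t, iterations=200):
    """Bisection for the lam > 0 such that constraint(lam) = eps^2.

    Requires 0 < eps^2 < sum_k f_k^2, so that a root exists.
    """
    lo, hi = 0.0, 1.0
    while constraint(hi, f, x, t) < eps * eps:  # enlarge the bracket
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if constraint(mid, f, x, t) < eps * eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical toy data: discrete spectrum and a coefficient vector of norm 1.
x = [float(k) for k in range(20)]
raw = [1.0 / (1 + k) for k in range(20)]
norm = math.sqrt(sum(v * v for v in raw))
f = [v / norm for v in raw]
t, eps = 0.5, 0.3

lam_star = solve_multiplier(eps, f, x, t)
```

Evaluating `squared_norm_g0(lam_star, f, x, t)` then gives the value of the constrained infimum in this toy model.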

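Let us recall the following result from [9], Lemma 3.19.

Theorem 8 There exist $b > 1$, $C_1' > 0$, $C_2' > 0$, such that for all $\lambda \ge 1$, $\delta = \frac{1}{\lambda}$, then

\[
\dim(\Sigma_{b\lambda}) - \dim(\Sigma_\lambda) = \dim(\Sigma_{b\lambda} \ominus \Sigma_\lambda) = \int_{\mathcal{M}} P_{\Sigma_{b\lambda}}(x,x)\,d\mu(x) - \int_{\mathcal{M}} P_{\Sigma_\lambda}(x,x)\,d\mu(x) \sim \int_{\mathcal{M}} \frac{1}{|B(x,\delta)|}\,d\mu(x)
\]

and more precisely:

\[
C_1' \int_{\mathcal{M}} \frac{1}{|B(x,\delta)|}\,d\mu(x) \le \dim(\Sigma_{b\lambda} \ominus \Sigma_\lambda) \le C_2' \int_{\mathcal{M}} \frac{1}{|B(x,\delta)|}\,d\mu(x).
\]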

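As $P_{\Sigma_{\sqrt{a}}} = E_a$, one can build a function $f \in L^2$ such that:

\[
\|f - P_{\Sigma_{\sqrt{a}}}f\|_2^2 = \int_a^\infty d\langle E_x f, f\rangle = \|f\|_2^2 - \|E_a f\|_2^2 = \|f - E_a f\|_2^2 = a^{-s}
\]

for $a = b^{2j}$, and $j \in \mathbb{N}$. It is enough to have:

\[
\|P_{\Sigma_{b^{j+1}} \ominus \Sigma_{b^j}}(f)\|_2^2 = b^{-2js} - b^{-2(j+1)s},
\]

and this could be done by the previous theorem, which shows that each space $\Sigma_{b^{j+1}} \ominus \Sigma_{b^j}$ is nontrivial.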

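Let us choose, for $\varepsilon > 0$, the integer $j$ such that $b^{-2js} \ge 4\varepsilon^2 > b^{-2(j+1)s}$. So

\[
\int_{b^{2j}}^\infty d\langle E_x f, f\rangle = b^{-2js} \ge 4\varepsilon^2 > b^{-2(j+1)s} = \int_{b^{2(j+1)}}^\infty d\langle E_x f, f\rangle
\]

so, if $\lambda = e^{-ta}$, $a = b^{2j}$,

\[
\int_0^\infty \Big(\frac{\lambda}{\lambda + e^{-tx}}\Big)^2 d\langle E_x f, f\rangle \ge \int_a^\infty \Big(\frac{\lambda}{\lambda + e^{-tx}}\Big)^2 d\langle E_x f, f\rangle \ge \int_a^\infty \Big(\frac{\lambda}{\lambda + \lambda}\Big)^2 d\langle E_x f, f\rangle = \frac{1}{4}\int_a^\infty d\langle E_x f, f\rangle \ge \varepsilon^2.
\]

Since $\lambda \mapsto \|\lambda(\lambda + P_t)^{-1}f\|_2^2$ is increasing, the value of $\lambda$ solving the constraint is at most $e^{-ta}$; since $\lambda \mapsto \|(\lambda + P_t)^{-1}P_{t/2}f\|_2^2$ is decreasing, $\|g_0\|_2^2$ is bounded from below by its value at $\lambda = e^{-ta}$. But, for $\lambda = e^{-ta}$,

\[
\begin{aligned}
\|(\lambda + P_t)^{-1}P_{t/2}f\|_2^2 &= \int_0^\infty \Big(\frac{1}{\lambda + e^{-tx}}\Big)^2 e^{-tx}\,d\langle E_x f, f\rangle \\
&= \int_0^\infty \Big(\frac{1}{e^{-ta} + e^{-tx}}\Big)^2 e^{-tx}\,d\langle E_x f, f\rangle = e^{ta}\int_0^\infty \Big(\frac{e^{-tx/2}\,e^{-ta/2}}{e^{-ta} + e^{-tx}}\Big)^2 d\langle E_x f, f\rangle \\
&\ge e^{ta}\,\frac{1}{4}\int_0^a e^{-t(a-x)}\,d\langle E_x f, f\rangle \ge e^{ta}\,\frac{1}{4}\int_{b^{2(j-1)}}^{b^{2j}} e^{-t(a-x)}\,d\langle E_x f, f\rangle \ge e^{t b^{2(j-1)}}\,\frac{1}{4}\int_{b^{2(j-1)}}^{b^{2j}} d\langle E_x f, f\rangle \\
&= e^{t b^{2(j-1)}}\,\frac{1}{4}\big(b^{-(2j-2)s} - b^{-2js}\big) \\
&= e^{t b^{2(j-1)}}\,\frac{1}{4}\,b^{-2js}\,(b^{2s} - 1) \ge \varepsilon^2 (b^{2s} - 1)\,e^{t b^{2(j-1)}} \ge \varepsilon^2 (b^{2s} - 1)\,e^{t c\,\varepsilon^{-2/s}}; \qquad c = 4^{-1/s}\,b^{-4}.
\end{aligned}
\]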
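The last inequality rests on the elementary fact that $4\varepsilon^2 > b^{-2(j+1)s}$ implies $b^{2(j-1)} > 4^{-1/s}\,b^{-4}\,\varepsilon^{-2/s}$. A minimal numerical sketch of this bookkeeping follows; the helper names are hypothetical, and the restriction $0 < \varepsilon < 1/2$ is an assumption made here only so that the level $j$ is nonnegative.

```python
import math


def choose_level(eps, b, s):
    """Return the integer j with b^(-2js) >= 4 eps^2 > b^(-2(j+1)s).

    Assumes 0 < eps < 1/2 (so that j >= 0), b > 1 and s > 0.
    """
    j = math.floor(-math.log(4 * eps * eps) / (2 * s * math.log(b)))
    # Guard against rounding at the boundaries of the bracket.
    while b ** (-2 * j * s) < 4 * eps * eps:
        j -= 1
    while b ** (-2 * (j + 1) * s) >= 4 * eps * eps:
        j += 1
    return j


def exponent_bounds(eps, b, s):
    """Return (b^(2(j-1)), c * eps^(-2/s)) with c = 4^(-1/s) * b^(-4)."""
    j = choose_level(eps, b, s)
    c = 4 ** (-1 / s) * b ** (-4)
    return b ** (2 * (j - 1)), c * eps ** (-2 / s)


if __name__ == "__main__":
    for eps in (0.4, 0.1, 0.01, 0.001):
        print(eps, exponent_bounds(eps, 2.0, 1.5))
```

In each printed pair the first entry dominates the second, which is the comparison of exponents used in the final step of the proof.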






10 Appendix C: Compact Riemannian manifold. 

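Let $\mathcal{M}$ be a compact Riemannian manifold without boundary. Let $\mathcal{D}(\mathcal{M})$ be the algebra of infinitely
differentiable functions. Associated to the Riemannian metric $\rho$, one defines a measure $dx$, a gradient
operator $\nabla$ on $\mathcal{D}(\mathcal{M})$ and the Laplace operator $\Delta$. It holds

\[
\text{for all } f, g \in \mathcal{D}(\mathcal{M}), \qquad \int_{\mathcal{M}} \Delta f(x)\,g(x)\,dx = -\int_{\mathcal{M}} \langle \nabla f(x), \nabla g(x)\rangle\,dx,
\]

and in particular $\int_{\mathcal{M}} \Delta f(x)\,f(x)\,dx = -\int_{\mathcal{M}} |\nabla f(x)|^2\,dx \le 0$.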

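Thus $-\Delta$ is a positive symmetric operator. So actually

\[
L^2 = \bigoplus_{k} \mathcal{H}_{\lambda_k}, \qquad \lambda_0 = 0 < \lambda_1 < \lambda_2 < \dots
\]

\[
\dim(\mathcal{H}_{\lambda_k}) < \infty, \qquad \mathcal{H}_{\lambda_k} \subset \mathcal{D}(\mathcal{M}), \qquad f \in \mathcal{H}_{\lambda_k} \iff \Delta f = -\lambda_k f.
\]

One can prove (see [15]) that the semi-group $e^{t\Delta}$ is a kernel operator:

\[
e^{t\Delta}(x,y) = \sum_k e^{-t\lambda_k}\,P_{\mathcal{H}_{\lambda_k}}(x,y).
\]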
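The spectral expansion of the heat kernel can be made concrete on the simplest example, which is not treated in the paper: the unit circle with the (non normalized) Lebesgue measure $d\theta$. Under this assumption $\lambda_k = k^2$, $P_{\mathcal{H}_0}(x,y) = \frac{1}{2\pi}$ and $P_{\mathcal{H}_{k^2}}(x,y) = \frac{1}{\pi}\cos(k(x-y))$ for $k \ge 1$, and by the Poisson summation formula the series coincides with the periodized Euclidean Gaussian kernel. The sketch below compares the two expressions; the function names and truncation levels are hypothetical.

```python
import math


def heat_kernel_spectral(t, d, k_max=2000):
    """sum_k exp(-t lambda_k) P_{H_k}(x, y) on the unit circle, d = x - y.

    Assumption: lambda_k = k^2, P_{H_0} = 1/(2 pi) and
    P_{H_k}(x, y) = cos(k d) / pi for k >= 1 (measure d theta).
    """
    total = 1.0 / (2.0 * math.pi)
    for k in range(1, k_max + 1):
        total += math.exp(-k * k * t) * math.cos(k * d) / math.pi
    return total


def heat_kernel_wrapped(t, d, m_max=50):
    """Periodized Gaussian (4 pi t)^(-1/2) exp(-r^2 / (4 t)) on the circle."""
    return sum(
        math.exp(-((d + 2.0 * math.pi * m) ** 2) / (4.0 * t))
        for m in range(-m_max, m_max + 1)
    ) / math.sqrt(4.0 * math.pi * t)


if __name__ == "__main__":
    for t, d in ((0.5, 1.0), (0.05, 0.3), (1.0, 3.0)):
        print(t, d, heat_kernel_spectral(t, d), heat_kernel_wrapped(t, d))
```

On the diagonal and for small $t$, the value is close to $(4\pi t)^{-1/2}$, that is, a constant multiple of $1/|B(x,\sqrt{t})| = 1/(2\sqrt{t})$.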
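Moreover this kernel verifies: there exist positive $C_1$, $C_2$, $C$, $c$, such that

\[
\text{for all } u, v \in \mathcal{M}, \qquad \frac{C_1\,e^{-C\frac{\rho^2(u,v)}{t}}}{\sqrt{|B(u,\sqrt{t})|\,|B(v,\sqrt{t})|}} \le e^{t\Delta}(u,v) \le \frac{C_2\,e^{-c\frac{\rho^2(u,v)}{t}}}{\sqrt{|B(u,\sqrt{t})|\,|B(v,\sqrt{t})|}}
\]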

and moreover the property of doubling measure is verified. In fact we have a better result: 

Proposition 6 Let $\mathcal{M}$ be a compact Riemannian manifold of dimension $n$. Then there exist $0 < c < C < \infty$ such that:

\[
\text{for all } x \in \mathcal{M},\ \text{for all } 0 < r \le \mathrm{Diam}(\mathcal{M}), \qquad c\,r^n \le |B(x,r)| \le C\,r^n.
\]

Proof:

Let $\mu$ and $\rho$ be the (non normalized) Riemannian measure and metric on $\mathcal{M}$. The proposition is a
consequence of the Bishop-Gromov comparison Theorem, see [16] and [8].
As $\mathcal{M}$ is compact, clearly

\[
\exists \kappa \in \mathbb{R} \text{ such that: for all } x \in \mathcal{M}, \quad \mathrm{Ricc}_x \ge (n-1)\,\kappa\,g_x
\]

where $\mathrm{Ricc}$ is the Ricci tensor and $g$ is the metric tensor. Let $V_\kappa(r)$ be the volume of the (any) ball of
radius $r$ in the model space of dimension $n$ and constant sectional curvature $\kappa$. Let $V_n$ be the volume of
the unit ball of $\mathbb{R}^n$.

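1. For $\kappa > 0$, the model space is the sphere $\frac{1}{\sqrt{\kappa}}\mathbb{S}^n$ of $\mathbb{R}^{n+1}$ of radius $\frac{1}{\sqrt{\kappa}}$ and

\[
V_\kappa(r) = n V_n \int_0^r \Big(\frac{\sin(\sqrt{\kappa}\,t)}{\sqrt{\kappa}}\Big)^{n-1} dt; \qquad \text{so} \quad \Big(\frac{2}{\pi}\Big)^{n-1} V_n r^n \le V_\kappa(r) \le V_n r^n.
\]

The lower bound uses $\sin u \ge \frac{2u}{\pi}$ for $0 \le u \le \frac{\pi}{2}$, hence requires $\sqrt{\kappa}\,r \le \frac{\pi}{2}$; this can always be arranged by taking a smaller $\kappa$.

2. For $\kappa = 0$, the model space is $\mathbb{R}^n$ and

\[
V_\kappa(r) = V_n r^n.
\]

3. For $\kappa < 0$ the model space is the hyperbolic space of constant sectional curvature $\kappa$.

\[
V_\kappa(r) = n V_n \int_0^r \Big(\frac{\sinh(\sqrt{|\kappa|}\,t)}{\sqrt{|\kappa|}}\Big)^{n-1} dt; \qquad \text{so} \quad V_n r^n \le V_\kappa(r) \le V_n r^n\,e^{(n-1)\sqrt{|\kappa|}\,r}
\]

as $s \le \sinh(s) \le s\,e^s$.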
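The three volume comparisons above can be checked numerically. The sketch below evaluates $V_\kappa(r)$ by a midpoint quadrature of the integral formulas; the quadrature rule, the number of steps and the function names are assumptions made for illustration only, and for $\kappa > 0$ the lower bound is tested only in the range $\sqrt{\kappa}\,r \le \pi/2$.

```python
import math


def unit_ball_volume(n):
    """Volume V_n of the unit ball of R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def model_ball_volume(n, kappa, r, steps=20000):
    """V_kappa(r) = n V_n * integral_0^r sn(t)^(n-1) dt, by the midpoint rule.

    sn(t) is sin(sqrt(kappa) t)/sqrt(kappa) for kappa > 0, t for kappa = 0,
    and sinh(sqrt(|kappa|) t)/sqrt(|kappa|) for kappa < 0.
    """
    def sn(t):
        if kappa > 0:
            return math.sin(math.sqrt(kappa) * t) / math.sqrt(kappa)
        if kappa < 0:
            return math.sinh(math.sqrt(-kappa) * t) / math.sqrt(-kappa)
        return t

    h = r / steps
    integral = h * sum(sn((i + 0.5) * h) ** (n - 1) for i in range(steps))
    return n * unit_ball_volume(n) * integral


if __name__ == "__main__":
    n = 3
    euclid = unit_ball_volume(n)
    print(model_ball_volume(n, 1.0, 1.2), euclid * 1.2 ** n)
    print(model_ball_volume(n, -1.0, 2.0), euclid * 2.0 ** n)
```

For $n = 2$ and $\kappa = 1$ the quadrature can be compared with the closed form $2\pi(1 - \cos r)$ for the area of a spherical cap.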

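Moreover by the Bishop-Gromov comparison Theorem: $r \mapsto \frac{|B(x,r)|}{V_\kappa(r)}$ is non increasing. So if
$0 < \varepsilon < r \le s \le R = \mathrm{diam}(\mathcal{M})$:

\[
\frac{|\mathcal{M}|}{V_\kappa(R)} = \frac{|B(x,R)|}{V_\kappa(R)} \le \frac{|B(x,s)|}{V_\kappa(s)} \le \frac{|B(x,r)|}{V_\kappa(r)} \le \frac{|B(x,\varepsilon)|}{V_\kappa(\varepsilon)}.
\]

Letting $\varepsilon \to 0$, the last ratio tends to 1. So

\[
\frac{V_\kappa(r)}{V_\kappa(s)} \le \frac{|B(x,r)|}{|B(x,s)|}; \qquad |\mathcal{M}|\,\frac{V_\kappa(r)}{V_\kappa(R)} \le |B(x,r)| \le V_\kappa(r).
\]

So

\[
A\Big(\frac{r}{s}\Big)^n \le \frac{|B(x,r)|}{|B(x,s)|} \ \text{(doubling)}; \qquad c\,r^n \le |B(x,r)| \le C\,V_n r^n \ \text{(homogeneity)}
\]

with

for $\kappa > 0$: $C = 1$, $c = \big(\frac{2}{\pi}\big)^{n-1}\frac{|\mathcal{M}|}{R^n}$, $A = \big(\frac{2}{\pi}\big)^{n-1}$;

for $\kappa = 0$: $C = 1$, $c = \frac{|\mathcal{M}|}{R^n}$, $A = 1$;

for $\kappa < 0$: $C = e^{(n-1)\sqrt{|\kappa|}R}$, $c = \frac{|\mathcal{M}|}{R^n\,e^{(n-1)\sqrt{|\kappa|}R}}$, $A = \frac{1}{e^{(n-1)\sqrt{|\kappa|}R}}$.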

Remark 7 If $(\mathcal{M}, \mu, \rho)$ is a compact metric space with a Borel measure $\mu$, then if we have the doubling
condition:

\[
0 < r \le s \implies |B(x,s)| \le \frac{1}{A}\Big(\frac{s}{r}\Big)^m |B(x,r)|
\]

then

\[
\text{for all } r \le R = \mathrm{diam}(\mathcal{M}), \qquad C\,r^m \le |B(x,r)|, \quad \text{with } C = \frac{A\,|\mathcal{M}|}{R^m}.
\]